{
  "metadata": {
    "kernelspec": {
      "language": "python",
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python",
      "version": "3.7.12",
      "mimetype": "text/x-python",
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "pygments_lexer": "ipython3",
      "nbconvert_exporter": "python",
      "file_extension": ".py"
    },
    "colab": {
      "name": "Lab09_Relational_Database_and_data_wrangling.ipynb",
      "provenance": [],
      "toc_visible": true,
      "collapsed_sections": []
    }
  },
  "nbformat_minor": 0,
  "nbformat": 4,
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "**Lab 9 – Relational Database and data wrangling**"
      ],
      "metadata": {
        "id": "1QiCFLer1FIe"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "_This notebook contains samples from https://www.kaggle.com/learn/, https://github.com/ageron/handson-ml2 and https://github.com/wesm/pydata-book._"
      ],
      "metadata": {
        "id": "vCyq3-8y1FIj"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "<table align=\"left\">\n",
        "  <td>\n",
        "    <a href=\"https://colab.research.google.com/github/phonchi/nsysu-math604/blob/master/static_files/presentations/08_Dataset.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://kaggle.com/kernels/welcome?src=https://github.com/phonchi/nsysu-math604/blob/master/static_files/presentations/08_Dataset.ipynb\"><img src=\"https://kaggle.com/static/images/open-in-kaggle.svg\" /></a>\n",
        "  </td>\n",
        "</table>"
      ],
      "metadata": {
        "id": "9J5g6PDs1FIk"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "from google.cloud import bigquery\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "\n",
        "import matplotlib as mpl\n",
        "from matplotlib import pyplot as plt\n",
        "%matplotlib inline"
      ],
      "metadata": {
        "id": "5bV_HvPiH-9i",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:33:52.948105Z",
          "iopub.execute_input": "2022-04-23T03:33:52.948724Z",
          "iopub.status.idle": "2022-04-23T03:33:52.976140Z",
          "shell.execute_reply.started": "2022-04-23T03:33:52.948606Z",
          "shell.execute_reply": "2022-04-23T03:33:52.975047Z"
        },
        "trusted": true
      },
      "execution_count": 1,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Querying data with BigQuery"
      ],
      "metadata": {
        "id": "6sb3RBiSo4Vf"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Structured Query Language, or SQL, is the programming language used with databases, and it is an important skill for any data scientist. In this example, you'll build your SQL skills using BigQuery, a web service that works as a database management system and lets you apply SQL to huge datasets."
      ],
      "metadata": {
        "id": "qvr5MUX8pFwO"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Preliminaries for Google Colab (optional)\n",
        "\n",
        "We want to start exploring the Google BigQuery [public datasets](https://cloud.google.com/bigquery/public-data/). Let's start by walking through the required setup steps, and then we can load and explore some data.\n",
        "\n",
        "If you are using Colab, follow [this quickstart guide](https://cloud.google.com/bigquery/docs/quickstarts/quickstart-client-libraries), which explains how to:\n",
        "1. Create a [Cloud Platform project](https://console.cloud.google.com/cloud-resource-manager) if you don't have one already.\n",
        "2. [Enable billing](https://support.google.com/cloud/answer/6293499#enable-billing) for the project\n",
        "3. [Enable the BigQuery API](https://console.cloud.google.com/flows/enableapi?apiid=bigquery)\n",
        "4. [Enable the service account](https://cloud.google.com/docs/authentication/getting-started)\n",
        "\n",
        "Now we need to authenticate to gain access to the BigQuery API. We will create a client, specifying the service account key file (replace 'utopian-datum-340514-9ffc23108bf4.json' with your key file)."
      ],
      "metadata": {
        "id": "OvNovWe2Hobw"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "from google.oauth2 import service_account\n",
        "\n",
        "# TODO(developer): Set key_path to the path to the service account key\n",
        "#                  file.\n",
        "\n",
        "key_path = \"utopian-datum-340514-9ffc23108bf4.json\"\n",
        "\n",
        "credentials = service_account.Credentials.from_service_account_file(\n",
        "    key_path, scopes=[\"https://www.googleapis.com/auth/cloud-platform\"],\n",
        ")"
      ],
      "metadata": {
        "id": "CLZXPyBqcysR"
      },
      "execution_count": 5,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Now that we're authenticated, we need to load the BigQuery package and the `google.colab.data_table` package, which can display large pandas DataFrames as interactive tables. Loading `data_table` is optional, but it will be useful for working with data in pandas."
      ],
      "metadata": {
        "id": "CFpSwmOJIDS_"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "%load_ext google.cloud.bigquery\n",
        "%load_ext google.colab.data_table"
      ],
      "metadata": {
        "id": "BnVn82RDGyQu"
      },
      "execution_count": 6,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "client = bigquery.Client(credentials=credentials, project=credentials.project_id,)"
      ],
      "metadata": {
        "id": "vcmdamwKG3Et"
      },
      "execution_count": 23,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Create the reference\n"
      ],
      "metadata": {
        "id": "1UK6R1kBIaPC"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "You can also work with Kaggle, which provides BigQuery integration, so you do not need to set up a Google account. **Each Kaggle user can scan 5TB every 30 days for free.  Once you hit that limit, you'll have to wait for it to reset.** See https://www.kaggle.com/product-feedback/48573 for more details.\n"
      ],
      "metadata": {
        "id": "ssJU-WK9MSSE"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "The first step in the workflow is to create a [`Client`](https://google-cloud.readthedocs.io/en/latest/bigquery/generated/google.cloud.bigquery.client.Client.html#google.cloud.bigquery.client.Client) object.  As you'll soon see, this `Client` object will play a central role in retrieving information from BigQuery datasets."
      ],
      "metadata": {
        "id": "YRCSulcqCL_o"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Create a \"Client\" object if you are using Kaggle\n",
        "client = bigquery.Client()"
      ],
      "metadata": {
        "id": "NLmMfMf4CO7t",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:20.953692Z",
          "iopub.execute_input": "2022-04-23T03:34:20.953969Z",
          "iopub.status.idle": "2022-04-23T03:34:20.959198Z",
          "shell.execute_reply.started": "2022-04-23T03:34:20.953941Z",
          "shell.execute_reply": "2022-04-23T03:34:20.958252Z"
        },
        "trusted": true
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "We'll work with a dataset of posts on Hacker News, a website focusing on computer science and cybersecurity news. In BigQuery, each dataset is contained in a corresponding project. In this case, our `hacker_news` dataset is contained in the `bigquery-public-data` project.\n",
        "\n",
        "To access the dataset, we begin by constructing a reference to the dataset with the [`dataset()`](https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html?highlight=dataset#google.cloud.bigquery.client.Client.dataset) method. Next, we use the [`get_dataset()`](https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html?highlight=get_dataset#google.cloud.bigquery.client.Client.get_dataset) method, along with the reference we just constructed, to fetch the dataset.\n",
        "\n",
        "[See the full list of public datasets](https://console.cloud.google.com/marketplace/browse?filter=solution-type:dataset) or the [Kaggle BigQuery datasets](https://www.kaggle.com/datasets?search=bigquery) if you want to explore others."
      ],
      "metadata": {
        "id": "OgWXWB1ICXLi"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"hacker_news\" dataset\n",
        "dataset_ref = client.dataset(\"hacker_news\", project=\"bigquery-public-data\")\n",
        "\n",
        "# API request - fetch the dataset\n",
        "dataset = client.get_dataset(dataset_ref)"
      ],
      "metadata": {
        "id": "oC7ldHp4CRf-",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:22.839652Z",
          "iopub.execute_input": "2022-04-23T03:34:22.840484Z",
          "iopub.status.idle": "2022-04-23T03:34:23.269423Z",
          "shell.execute_reply.started": "2022-04-23T03:34:22.840431Z",
          "shell.execute_reply": "2022-04-23T03:34:23.268733Z"
        },
        "trusted": true
      },
      "execution_count": 7,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Every dataset is just a collection of tables. You can think of a dataset as a spreadsheet file containing multiple tables, all composed of rows and columns. We use the [`list_tables()`](https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html?highlight=list_tables#google.cloud.bigquery.client.Client.list_tables) method to list the tables in the dataset."
      ],
      "metadata": {
        "id": "M8Xms4W_I1QQ"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# List all the tables in the \"hacker_news\" dataset\n",
        "tables = list(client.list_tables(dataset))\n",
        "\n",
        "# Print names of all tables in the dataset (there are four!)\n",
        "for table in tables:  \n",
        "    print(table.table_id)"
      ],
      "metadata": {
        "id": "M2gOPT8jCvCL",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:25.195022Z",
          "iopub.execute_input": "2022-04-23T03:34:25.195282Z",
          "iopub.status.idle": "2022-04-23T03:34:25.507541Z",
          "shell.execute_reply.started": "2022-04-23T03:34:25.195254Z",
          "shell.execute_reply": "2022-04-23T03:34:25.506966Z"
        },
        "trusted": true,
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "a4400d27-47aa-4080-be31-6e174551b649"
      },
      "execution_count": 8,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "comments\n",
            "full\n",
            "full_201510\n",
            "stories\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Similar to how we fetched a dataset, we can fetch a table. In the code cell below, we fetch the `full` table in the `hacker_news` dataset."
      ],
      "metadata": {
        "id": "EOCsUwyEJA0K"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"full\" table\n",
        "table_ref = dataset_ref.table(\"full\")\n",
        "\n",
        "# API request - fetch the table\n",
        "table = client.get_table(table_ref)"
      ],
      "metadata": {
        "id": "Q6JCjVCGI6ev",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:26.126551Z",
          "iopub.execute_input": "2022-04-23T03:34:26.127256Z",
          "iopub.status.idle": "2022-04-23T03:34:26.318084Z",
          "shell.execute_reply.started": "2022-04-23T03:34:26.127219Z",
          "shell.execute_reply": "2022-04-23T03:34:26.317267Z"
        },
        "trusted": true
      },
      "execution_count": 9,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "In the next section, you'll explore the contents of this table in more detail.  For now, take the time to use the image below to consolidate what you've learned so far.\n",
        "\n",
        "![first_commands](https://i.imgur.com/biYqbUB.png)"
      ],
      "metadata": {
        "id": "EQ_-OP8cJtqo"
      }
    },
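    {
      "cell_type": "markdown",
      "source": [
        "The project, dataset and table levels used above can also be written as a single dotted identifier, which is the form BigQuery SQL uses to name a table in a `FROM` clause. The cell below is a minimal sketch using only the Python standard library. The `TableAddress` class and its `full_id` property are hypothetical helpers introduced for illustration; they are not part of the BigQuery client library."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "from dataclasses import dataclass\n",
        "\n",
        "\n",
        "@dataclass(frozen=True)\n",
        "class TableAddress:\n",
        "    # Hypothetical helper: the three levels that locate a BigQuery table\n",
        "    project: str\n",
        "    dataset: str\n",
        "    table: str\n",
        "\n",
        "    @property\n",
        "    def full_id(self) -> str:\n",
        "        # BigQuery SQL addresses a table as project.dataset.table\n",
        "        return f'{self.project}.{self.dataset}.{self.table}'\n",
        "\n",
        "\n",
        "# The names below are the ones used earlier in this notebook\n",
        "address = TableAddress('bigquery-public-data', 'hacker_news', 'full')\n",
        "print(address.full_id)"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },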
    {
      "cell_type": "markdown",
      "source": [
        "### Table schema\n",
        "\n",
        "The structure of a table is called its **schema**.  **We need to understand a table's schema to effectively pull out the data we want.** \n",
        "\n",
        "In this example, we'll investigate the `full` table that we fetched above."
      ],
      "metadata": {
        "id": "RK2iKPr0J295"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Print information on all the columns in the \"full\" table in the \"hacker_news\" dataset\n",
        "table.schema"
      ],
      "metadata": {
        "id": "oAcsp296JMPg",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:27.440643Z",
          "iopub.execute_input": "2022-04-23T03:34:27.440936Z",
          "iopub.status.idle": "2022-04-23T03:34:27.448446Z",
          "shell.execute_reply.started": "2022-04-23T03:34:27.440908Z",
          "shell.execute_reply": "2022-04-23T03:34:27.447878Z"
        },
        "trusted": true,
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "181ec15c-ad4d-472d-d147-ce134a101477"
      },
      "execution_count": 10,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "[SchemaField('title', 'STRING', 'NULLABLE', 'Story title', ()),\n",
              " SchemaField('url', 'STRING', 'NULLABLE', 'Story url', ()),\n",
              " SchemaField('text', 'STRING', 'NULLABLE', 'Story or comment text', ()),\n",
              " SchemaField('dead', 'BOOLEAN', 'NULLABLE', 'Is dead?', ()),\n",
              " SchemaField('by', 'STRING', 'NULLABLE', \"The username of the item's author.\", ()),\n",
              " SchemaField('score', 'INTEGER', 'NULLABLE', 'Story score', ()),\n",
              " SchemaField('time', 'INTEGER', 'NULLABLE', 'Unix time', ()),\n",
              " SchemaField('timestamp', 'TIMESTAMP', 'NULLABLE', 'Timestamp for the unix time', ()),\n",
              " SchemaField('type', 'STRING', 'NULLABLE', 'Type of details (comment, comment_ranking, poll, story, job, pollopt)', ()),\n",
              " SchemaField('id', 'INTEGER', 'NULLABLE', \"The item's unique id.\", ()),\n",
              " SchemaField('parent', 'INTEGER', 'NULLABLE', 'Parent comment ID', ()),\n",
              " SchemaField('descendants', 'INTEGER', 'NULLABLE', 'Number of story or poll descendants', ()),\n",
              " SchemaField('ranking', 'INTEGER', 'NULLABLE', 'Comment ranking', ()),\n",
              " SchemaField('deleted', 'BOOLEAN', 'NULLABLE', 'Is deleted?', ())]"
            ]
          },
          "metadata": {},
          "execution_count": 10
        }
      ]
    },
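    {
      "cell_type": "markdown",
      "source": [
        "The schema above contains both a `time` field (an INTEGER described as Unix time) and a `timestamp` field (a TIMESTAMP for that same moment). As a small sketch of how the two relate, the standard library can convert a Unix time, counted in seconds since 1970-01-01 00:00:00 UTC, into a timezone-aware datetime. The value of `unix_time` below is only an illustrative sample; any integer Unix time would work the same way."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "from datetime import datetime, timezone\n",
        "\n",
        "# Illustrative Unix time in seconds since 1970-01-01 00:00:00 UTC\n",
        "unix_time = 1642311281\n",
        "\n",
        "# Convert to a timezone-aware datetime in UTC\n",
        "moment = datetime.fromtimestamp(unix_time, tz=timezone.utc)\n",
        "\n",
        "# Prints 2022-01-16 05:34:41+00:00\n",
        "print(moment.isoformat(sep=' '))"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },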
    {
      "cell_type": "markdown",
      "source": [
        "Each [`SchemaField`](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.schema.SchemaField.html#google.cloud.bigquery.schema.SchemaField) tells us about a specific column (which we also refer to as a **field**). In order, the information is:\n",
        "\n",
        "* The **name** of the column\n",
        "* The **field type** (or datatype) in the column\n",
        "* The **mode** of the column (`'NULLABLE'` means that a column allows NULL values, and is the default)\n",
        "* A **description** of the data in that column\n",
        "\n",
        "For instance, the `by` field has the SchemaField:\n",
        "\n",
        "`SchemaField('by', 'STRING', 'NULLABLE', \"The username of the item's author.\", ())`\n",
        "\n",
        "This tells us:\n",
        "- the field (or column) is called `by`,\n",
        "- the data in this field is strings, \n",
        "- NULL values are allowed, and\n",
        "- it contains the usernames corresponding to each item's author.\n",
        "\n",
        "We can use the [`list_rows()`](https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html?highlight=list_rows#google.cloud.bigquery.client.Client.list_rows) method to check just the first five rows of the `full` table to make sure this is right.  This returns a BigQuery [`RowIterator`](https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.table.RowIterator.html?highlight=rowiterator#google.cloud.bigquery.table.RowIterator) object that can quickly be converted to a pandas DataFrame with the [`to_dataframe()`](https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.table.RowIterator.html?highlight=to_dataframe#google.cloud.bigquery.table.RowIterator.to_dataframe) method."
      ],
      "metadata": {
        "id": "Pm6u6FfGKA8v"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Preview the first five rows of the \"full\" table\n",
        "client.list_rows(table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "0eFyTIvFJ7ML",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:28.459739Z",
          "iopub.execute_input": "2022-04-23T03:34:28.460370Z",
          "iopub.status.idle": "2022-04-23T03:34:29.255110Z",
          "shell.execute_reply.started": "2022-04-23T03:34:28.460329Z",
          "shell.execute_reply": "2022-04-23T03:34:29.254606Z"
        },
        "trusted": true,
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 577
        },
        "outputId": "f6e80114-32ca-40e8-8624-87c482e9f96f"
      },
      "execution_count": 11,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "  title   url                                               text  dead  \\\n",
              "0  None  None  The corruption isn&#x27;t about hurting or hel...  None   \n",
              "1  None  None  I had the choice of a M1 or a Thinkpad (which ...  None   \n",
              "2  None  None  Having a phone with you that you keep turned o...  None   \n",
              "3  None  None  I expect I&#x27;ll get infected eventually. I&...  None   \n",
              "4  None  None  I am trying to find more information such as h...  None   \n",
              "\n",
              "             by score        time                 timestamp     type  \\\n",
              "0       azernik  None  1642311281 2022-01-16 05:34:41+00:00  comment   \n",
              "1       dopeboy  None  1642311284 2022-01-16 05:34:44+00:00  comment   \n",
              "2      dane-pgp  None  1642311221 2022-01-16 05:33:41+00:00  comment   \n",
              "3        simsla  None  1642311228 2022-01-16 05:33:48+00:00  comment   \n",
              "4  hamiltonians  None  1642311234 2022-01-16 05:33:54+00:00  comment   \n",
              "\n",
              "         id    parent descendants ranking deleted  \n",
              "0  29953698  29951182        None    None    None  \n",
              "1  29953699  29950651        None    None    None  \n",
              "2  29953694  29953567        None    None    None  \n",
              "3  29953695  29953363        None    None    None  \n",
              "4  29953696  29953072        None    None    None  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-9fded52d-f4f7-403e-a0f4-783b5acd0302\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>title</th>\n",
              "      <th>url</th>\n",
              "      <th>text</th>\n",
              "      <th>dead</th>\n",
              "      <th>by</th>\n",
              "      <th>score</th>\n",
              "      <th>time</th>\n",
              "      <th>timestamp</th>\n",
              "      <th>type</th>\n",
              "      <th>id</th>\n",
              "      <th>parent</th>\n",
              "      <th>descendants</th>\n",
              "      <th>ranking</th>\n",
              "      <th>deleted</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>The corruption isn&amp;#x27;t about hurting or hel...</td>\n",
              "      <td>None</td>\n",
              "      <td>azernik</td>\n",
              "      <td>None</td>\n",
              "      <td>1642311281</td>\n",
              "      <td>2022-01-16 05:34:41+00:00</td>\n",
              "      <td>comment</td>\n",
              "      <td>29953698</td>\n",
              "      <td>29951182</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>I had the choice of a M1 or a Thinkpad (which ...</td>\n",
              "      <td>None</td>\n",
              "      <td>dopeboy</td>\n",
              "      <td>None</td>\n",
              "      <td>1642311284</td>\n",
              "      <td>2022-01-16 05:34:44+00:00</td>\n",
              "      <td>comment</td>\n",
              "      <td>29953699</td>\n",
              "      <td>29950651</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>Having a phone with you that you keep turned o...</td>\n",
              "      <td>None</td>\n",
              "      <td>dane-pgp</td>\n",
              "      <td>None</td>\n",
              "      <td>1642311221</td>\n",
              "      <td>2022-01-16 05:33:41+00:00</td>\n",
              "      <td>comment</td>\n",
              "      <td>29953694</td>\n",
              "      <td>29953567</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>I expect I&amp;#x27;ll get infected eventually. I&amp;...</td>\n",
              "      <td>None</td>\n",
              "      <td>simsla</td>\n",
              "      <td>None</td>\n",
              "      <td>1642311228</td>\n",
              "      <td>2022-01-16 05:33:48+00:00</td>\n",
              "      <td>comment</td>\n",
              "      <td>29953695</td>\n",
              "      <td>29953363</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>I am trying to find more information such as h...</td>\n",
              "      <td>None</td>\n",
              "      <td>hamiltonians</td>\n",
              "      <td>None</td>\n",
              "      <td>1642311234</td>\n",
              "      <td>2022-01-16 05:33:54+00:00</td>\n",
              "      <td>comment</td>\n",
              "      <td>29953696</td>\n",
              "      <td>29953072</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-9fded52d-f4f7-403e-a0f4-783b5acd0302')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-9fded52d-f4f7-403e-a0f4-783b5acd0302 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-9fded52d-f4f7-403e-a0f4-783b5acd0302');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"The corruption isn&#x27;t about hurting or helping Tesla.<p>It&#x27;s about Musk using his personal power and influence over Tesla to deter (the employees of) regulatory agencies from investigating him like any other citizen.\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"azernik\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 1642311281,\n            'f': \"1642311281\",\n        },\n\"2022-01-16 05:34:41+00:00\",\n\"comment\",\n{\n            'v': 29953698,\n            'f': \"29953698\",\n        },\n{\n            'v': 29951182,\n            'f': \"29951182\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        }],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"I had the choice of a M1 or a Thinkpad (which is what I&#x27;ve been using for the past decade). It was a very difficult decision - everything I hear about the M1 is incredible. I ended up getting a thinkpad because I really don&#x27;t like macOS. 
But I don&#x27;t know if I&#x27;ll make that same decision in a couple years.\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"dopeboy\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 1642311284,\n            'f': \"1642311284\",\n        },\n\"2022-01-16 05:34:44+00:00\",\n\"comment\",\n{\n            'v': 29953699,\n            'f': \"29953699\",\n        },\n{\n            'v': 29950651,\n            'f': \"29950651\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        }],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"Having a phone with you that you keep turned off most of the time would be a good way to learn what sort of things you really need it for.  Obviously you wouldn&#x27;t be able to receive incoming calls, but how much do you use your phone to browse the web, or as a map?  If you can time-shift those desires (until you are at a computer) or satisfy them with other devices, then you&#x27;ll be able to go longer and longer without turning the phone on, but still have it for emergencies.<p>Personally I&#x27;d settle for a phone which doesn&#x27;t connect to the mobile network unless I&#x27;m dialling out, or only connects when I&#x27;m in certain locations at certain times.  Perhaps using a VoIP service and automatically connecting to specific trusted Wi-Fi networks would suffice for that use case, however I&#x27;ve often wondered if it would be possible for a mobile network provider to also operate an FM radio station, which would broadcast a pre-agreed code specific to one of their users whenever that user had an incoming call.  
I don&#x27;t know how much battery it would drain for a phone to be constantly scanning FM radio data, though.\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"dane-pgp\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 1642311221,\n            'f': \"1642311221\",\n        },\n\"2022-01-16 05:33:41+00:00\",\n\"comment\",\n{\n            'v': 29953694,\n            'f': \"29953694\",\n        },\n{\n            'v': 29953567,\n            'f': \"29953567\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        }],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"I expect I&#x27;ll get infected eventually. I&#x27;d prefer not to get infected at a point when the system is (about to be) overloaded.<p>Omicron can&#x27;t be meaningfully stopped, but it can be rate limited.\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"simsla\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 1642311228,\n            'f': \"1642311228\",\n        },\n\"2022-01-16 05:33:48+00:00\",\n\"comment\",\n{\n            'v': 29953695,\n            'f': \"29953695\",\n        },\n{\n            'v': 29953363,\n            'f': \"29953363\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        }],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"I am trying to find 
more information such as how they obtain the livestream views or how they hack the twitter accounts.   Hundreds of articles have been written about this but no insight as to where the livestream views come from.  Are the livestream  from proxies or  some sort of browser hijack.<p>Also, almost everyone by now is aware of the scam given all the news coverage it over the past few years, so I don&#x27;t see the need to repeat myself again.<p>Google &quot;crypto giveaway scam YouTube&quot; no quotes for more info.\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n\"hamiltonians\",\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 1642311234,\n            'f': \"1642311234\",\n        },\n\"2022-01-16 05:33:54+00:00\",\n\"comment\",\n{\n            'v': 29953696,\n            'f': \"29953696\",\n        },\n{\n            'v': 29953072,\n            'f': \"29953072\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        }]],\n        columns: [[\"number\", \"index\"], [\"number\", \"title\"], [\"number\", \"url\"], [\"string\", \"text\"], [\"number\", \"dead\"], [\"string\", \"by\"], [\"number\", \"score\"], [\"number\", \"time\"], [\"string\", \"timestamp\"], [\"string\", \"type\"], [\"number\", \"id\"], [\"number\", \"parent\"], [\"number\", \"descendants\"], [\"number\", \"ranking\"], [\"number\", \"deleted\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n        rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 11
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "The `list_rows()` method can also restrict the preview to specific columns through its `selected_fields` argument. For example, to see the first five entries in the `by` column, we pass the matching slice of the table schema."
      ],
      "metadata": {
        "id": "tfGNKgW3KxWW"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Preview the first five entries in the \"by\" column of the \"full\" table\n",
        "client.list_rows(table, selected_fields=table.schema[4:5], max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "ghYSN97rKc6f",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:30.428333Z",
          "iopub.execute_input": "2022-04-23T03:34:30.428784Z",
          "iopub.status.idle": "2022-04-23T03:34:30.823825Z",
          "shell.execute_reply.started": "2022-04-23T03:34:30.428728Z",
          "shell.execute_reply": "2022-04-23T03:34:30.823228Z"
        },
        "trusted": true,
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "c829a51b-8abe-4421-97b8-696145d9562f"
      },
      "execution_count": 12,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "             by\n",
              "0       azernik\n",
              "1       dopeboy\n",
              "2      dane-pgp\n",
              "3        simsla\n",
              "4  hamiltonians"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-9b525332-4ace-4f44-9994-445c347eaf1b\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>by</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>azernik</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>dopeboy</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>dane-pgp</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>simsla</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>hamiltonians</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 12
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Select, From & Where"
      ],
      "metadata": {
        "id": "83w8rxkkM5TE"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Now that you know how to access and examine a dataset, you're ready to write your first SQL query!  As you'll soon see, **SQL queries will help you sort through a massive dataset, to retrieve only the information that you need.** We'll begin by using the keywords **SELECT**, **FROM**, and **WHERE** to get data from specific columns based on conditions you specify. "
      ],
      "metadata": {
        "id": "wrRPIX4PMsKK"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "We'll use an [OpenAQ](https://openaq.org) dataset about air quality. First, we'll set up everything we need to run queries and take a quick peek at what tables are in our database."
      ],
      "metadata": {
        "id": "yZwYMG_FNm5S"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"openaq\" dataset\n",
        "dataset_ref = client.dataset(\"openaq\", project=\"bigquery-public-data\")\n",
        "\n",
        "# API request - fetch the dataset\n",
        "dataset = client.get_dataset(dataset_ref)\n",
        "\n",
        "# List all the tables in the \"openaq\" dataset\n",
        "tables = list(client.list_tables(dataset))\n",
        "\n",
        "# Print names of all tables in the dataset (there's only one!)\n",
        "for table in tables:\n",
        "    print(table.table_id)"
      ],
      "metadata": {
        "id": "72Zhkk0wK3oN",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:34.103065Z",
          "iopub.execute_input": "2022-04-23T03:34:34.103476Z",
          "iopub.status.idle": "2022-04-23T03:34:34.569656Z",
          "shell.execute_reply.started": "2022-04-23T03:34:34.103437Z",
          "shell.execute_reply": "2022-04-23T03:34:34.568998Z"
        },
        "trusted": true,
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "6ffdfcb1-d260-4312-f359-d18210d6ef58"
      },
      "execution_count": 13,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "global_air_quality\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "The dataset contains only one table, called `global_air_quality`.  We'll fetch the table and take a peek at the first few rows to see what sort of data it contains."
      ],
      "metadata": {
        "id": "x1czVLqAOEcs"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"global_air_quality\" table\n",
        "table_ref = dataset_ref.table(\"global_air_quality\")\n",
        "\n",
        "# API request - fetch the table\n",
        "table = client.get_table(table_ref)\n",
        "\n",
        "# Preview the first five rows of the \"global_air_quality\" table\n",
        "client.list_rows(table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "kYAuivzjOAM9",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:36.196047Z",
          "iopub.execute_input": "2022-04-23T03:34:36.196746Z",
          "iopub.status.idle": "2022-04-23T03:34:36.908751Z",
          "shell.execute_reply.started": "2022-04-23T03:34:36.196712Z",
          "shell.execute_reply": "2022-04-23T03:34:36.907911Z"
        },
        "trusted": true,
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "7ca3142f-93ab-42c5-82b5-13e8b15145fd"
      },
      "execution_count": 14,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "  location            city country pollutant  value                 timestamp  \\\n",
              "0       MA  Salt Lake City      US       no2  0.003 2020-01-06 18:00:00+00:00   \n",
              "1       MA  Salt Lake City      US       so2  0.001 2020-01-06 18:00:00+00:00   \n",
              "2       MA  Salt Lake City      US        o3  0.039 2020-01-06 18:00:00+00:00   \n",
              "3       MA  Salt Lake City      US      pm25  1.300 2020-01-06 18:00:00+00:00   \n",
              "4       NR  Salt Lake City      US       no2  0.013 2020-06-11 00:00:00+00:00   \n",
              "\n",
              "    unit source_name   latitude   longitude  averaged_over_in_hours  \\\n",
              "0    ppm      AirNow  40.712063 -112.111120                     1.0   \n",
              "1    ppm      AirNow  40.712063 -112.111120                     1.0   \n",
              "2    ppm      AirNow  40.712063 -112.111120                     1.0   \n",
              "3  µg/m³      AirNow  40.712063 -112.111120                     1.0   \n",
              "4    ppm      AirNow  40.662840 -111.901794                     1.0   \n",
              "\n",
              "                 location_geom  \n",
              "0  POINT(-112.11112 40.712063)  \n",
              "1  POINT(-112.11112 40.712063)  \n",
              "2  POINT(-112.11112 40.712063)  \n",
              "3  POINT(-112.11112 40.712063)  \n",
              "4  POINT(-111.901794 40.66284)  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-14d0c54d-074a-42ba-a25c-4afd2c7edf08\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>location</th>\n",
              "      <th>city</th>\n",
              "      <th>country</th>\n",
              "      <th>pollutant</th>\n",
              "      <th>value</th>\n",
              "      <th>timestamp</th>\n",
              "      <th>unit</th>\n",
              "      <th>source_name</th>\n",
              "      <th>latitude</th>\n",
              "      <th>longitude</th>\n",
              "      <th>averaged_over_in_hours</th>\n",
              "      <th>location_geom</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>MA</td>\n",
              "      <td>Salt Lake City</td>\n",
              "      <td>US</td>\n",
              "      <td>no2</td>\n",
              "      <td>0.003</td>\n",
              "      <td>2020-01-06 18:00:00+00:00</td>\n",
              "      <td>ppm</td>\n",
              "      <td>AirNow</td>\n",
              "      <td>40.712063</td>\n",
              "      <td>-112.111120</td>\n",
              "      <td>1.0</td>\n",
              "      <td>POINT(-112.11112 40.712063)</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>MA</td>\n",
              "      <td>Salt Lake City</td>\n",
              "      <td>US</td>\n",
              "      <td>so2</td>\n",
              "      <td>0.001</td>\n",
              "      <td>2020-01-06 18:00:00+00:00</td>\n",
              "      <td>ppm</td>\n",
              "      <td>AirNow</td>\n",
              "      <td>40.712063</td>\n",
              "      <td>-112.111120</td>\n",
              "      <td>1.0</td>\n",
              "      <td>POINT(-112.11112 40.712063)</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>MA</td>\n",
              "      <td>Salt Lake City</td>\n",
              "      <td>US</td>\n",
              "      <td>o3</td>\n",
              "      <td>0.039</td>\n",
              "      <td>2020-01-06 18:00:00+00:00</td>\n",
              "      <td>ppm</td>\n",
              "      <td>AirNow</td>\n",
              "      <td>40.712063</td>\n",
              "      <td>-112.111120</td>\n",
              "      <td>1.0</td>\n",
              "      <td>POINT(-112.11112 40.712063)</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>MA</td>\n",
              "      <td>Salt Lake City</td>\n",
              "      <td>US</td>\n",
              "      <td>pm25</td>\n",
              "      <td>1.300</td>\n",
              "      <td>2020-01-06 18:00:00+00:00</td>\n",
              "      <td>µg/m³</td>\n",
              "      <td>AirNow</td>\n",
              "      <td>40.712063</td>\n",
              "      <td>-112.111120</td>\n",
              "      <td>1.0</td>\n",
              "      <td>POINT(-112.11112 40.712063)</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>NR</td>\n",
              "      <td>Salt Lake City</td>\n",
              "      <td>US</td>\n",
              "      <td>no2</td>\n",
              "      <td>0.013</td>\n",
              "      <td>2020-06-11 00:00:00+00:00</td>\n",
              "      <td>ppm</td>\n",
              "      <td>AirNow</td>\n",
              "      <td>40.662840</td>\n",
              "      <td>-111.901794</td>\n",
              "      <td>1.0</td>\n",
              "      <td>POINT(-111.901794 40.66284)</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 14
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Now let's put together a query. Say we want to select every value from the `city` column in rows where the `country` column is `'US'` (for \"United States\")."
      ],
      "metadata": {
        "id": "1g2H5wNrOJrB"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Query to select all the items from the \"city\" column where the \"country\" column is 'US'\n",
        "# SQL keywords are case-insensitive and extra whitespace is ignored, but string values\n",
        "# such as 'US' are compared case-sensitively. The layout below follows the conventional style.\n",
        "query = \"\"\"\n",
        "        SELECT city\n",
        "        FROM `bigquery-public-data.openaq.global_air_quality`\n",
        "        WHERE country = 'US'\n",
        "        \"\"\""
      ],
      "metadata": {
        "id": "HNVL36G6OI9z",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:34:37.669408Z",
          "iopub.execute_input": "2022-04-23T03:34:37.669919Z",
          "iopub.status.idle": "2022-04-23T03:34:37.674122Z",
          "shell.execute_reply.started": "2022-04-23T03:34:37.669880Z",
          "shell.execute_reply": "2022-04-23T03:34:37.673374Z"
        },
        "trusted": true
      },
      "execution_count": 15,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Notice also that the string literal inside the SQL statement is written with single quotes, while the query itself is a Python string delimited by triple quotation marks. We begin by setting up the query with the [`query()`](https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.client.Client.html?highlight=query#google.cloud.bigquery.client.Client.query) method."
      ],
      "metadata": {
        "id": "lrUVBJqmOfgD"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Set up the query\n",
        "query_job = client.query(query)\n",
        "\n",
        "# API request - run the query, and return a pandas DataFrame\n",
        "us_cities = query_job.to_dataframe()"
      ],
      "metadata": {
        "id": "3P7iBjrQbxC5",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:35:55.758822Z",
          "iopub.execute_input": "2022-04-23T03:35:55.759311Z",
          "iopub.status.idle": "2022-04-23T03:35:59.638985Z",
          "shell.execute_reply.started": "2022-04-23T03:35:55.759280Z",
          "shell.execute_reply": "2022-04-23T03:35:59.638236Z"
        },
        "trusted": true
      },
      "execution_count": 19,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Now we've got a pandas DataFrame called `us_cities`, which we can use like any other DataFrame."
      ],
      "metadata": {
        "id": "28V2b3IMderS"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# What five cities have the most measurements?\n",
        "us_cities.city.value_counts().head()"
      ],
      "metadata": {
        "id": "GBNcJ4NtdeAm",
        "execution": {
          "iopub.status.busy": "2022-04-23T03:36:03.918735Z",
          "iopub.execute_input": "2022-04-23T03:36:03.919522Z",
          "iopub.status.idle": "2022-04-23T03:36:03.937814Z",
          "shell.execute_reply.started": "2022-04-23T03:36:03.919480Z",
          "shell.execute_reply": "2022-04-23T03:36:03.936847Z"
        },
        "trusted": true,
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "703cd3cc-8759-4cdb-dba0-3db7a3d248b3"
      },
      "execution_count": 17,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "Phoenix-Mesa-Scottsdale                     2147\n",
              "Riverside-San Bernardino-Ontario            2138\n",
              "Los Angeles-Long Beach-Santa Ana            1656\n",
              "New York-Northern New Jersey-Long Island    1433\n",
              "San Francisco-Oakland-Fremont               1337\n",
              "Name: city, dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 17
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "If you want multiple columns, you can select them with a comma between the names:"
      ],
      "metadata": {
        "id": "yDRyu1cR4fLU"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "query = \"\"\"\n",
        "        SELECT city, country\n",
        "        FROM `bigquery-public-data.openaq.global_air_quality`\n",
        "        WHERE country = 'US'\n",
        "        \"\"\""
      ],
      "metadata": {
        "execution": {
          "iopub.status.busy": "2022-04-23T03:40:51.487848Z",
          "iopub.execute_input": "2022-04-23T03:40:51.488176Z",
          "iopub.status.idle": "2022-04-23T03:40:51.492074Z",
          "shell.execute_reply.started": "2022-04-23T03:40:51.488143Z",
          "shell.execute_reply": "2022-04-23T03:40:51.491286Z"
        },
        "trusted": true,
        "id": "Ixf_qpmo4fLU"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "You can select all columns with a `*` like this:"
      ],
      "metadata": {
        "id": "3BNBGKPC4fLU"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "query = \"\"\"\n",
        "        SELECT *\n",
        "        FROM `bigquery-public-data.openaq.global_air_quality`\n",
        "        WHERE country = 'US'\n",
        "        \"\"\""
      ],
      "metadata": {
        "id": "LQ29UlAi4fLU"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Querying big datasets"
      ],
      "metadata": {
        "id": "w2hxfX8Q4fLV"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "You can estimate the size of any query before running it. Here is an example using the Hacker News dataset. To see how much data a query will scan, we create a [`QueryJobConfig`](https://googleapis.dev/python/bigquery/latest/generated/google.cloud.bigquery.job.QueryJobConfig.html?highlight=queryjobconfig#google.cloud.bigquery.job.QueryJobConfig) object and set the `dry_run` parameter to `True`."
      ],
      "metadata": {
        "id": "u0R_qYE64fLV"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Query to get the score and title columns from every row where the type column has value \"job\"\n",
        "query = \"\"\"\n",
        "        SELECT score, title\n",
        "        FROM `bigquery-public-data.hacker_news.full`\n",
        "        WHERE type = \"job\" \n",
        "        \"\"\"\n",
        "\n",
        "# Create a QueryJobConfig object to estimate size of query without running it\n",
        "dry_run_config = bigquery.QueryJobConfig(dry_run=True)\n",
        "\n",
        "# API request - dry run query to estimate costs\n",
        "dry_run_query_job = client.query(query, job_config=dry_run_config)\n",
        "\n",
        "print(\"This query will process {} bytes.\".format(dry_run_query_job.total_bytes_processed))"
      ],
      "metadata": {
        "execution": {
          "iopub.status.busy": "2022-04-23T03:43:43.636879Z",
          "iopub.execute_input": "2022-04-23T03:43:43.637619Z",
          "iopub.status.idle": "2022-04-23T03:43:44.083153Z",
          "shell.execute_reply.started": "2022-04-23T03:43:43.637561Z",
          "shell.execute_reply": "2022-04-23T03:43:44.082302Z"
        },
        "trusted": true,
        "id": "wI2YIn_74fLV",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "1209729a-5595-4b2e-dd0e-59ba3c158f93"
      },
      "execution_count": 20,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "This query will process 520947826 bytes.\n"
          ]
        }
      ]
    },
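    {
      "cell_type": "markdown",
      "source": [
        "To make the dry-run estimate easier to read, you can convert the raw byte count into larger units. The cell below is only a sketch: `format_bytes` is a hypothetical helper, not part of the BigQuery client library, and it assumes decimal units (1 MB = 1,000,000 bytes), the same convention as the `ONE_MB` constant in this notebook. For the estimate above it reports roughly 521 MB."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "# Hypothetical helper (not part of google-cloud-bigquery):\n",
        "# format a byte count with decimal units, where 1 MB = 1000 * 1000 bytes\n",
        "def format_bytes(num_bytes):\n",
        "    size = float(num_bytes)\n",
        "    for unit in ['bytes', 'KB', 'MB', 'GB', 'TB']:\n",
        "        # Stop at the first unit that keeps the number below 1000,\n",
        "        # or at the largest unit we know about\n",
        "        if size < 1000 or unit == 'TB':\n",
        "            return '{:.2f} {}'.format(size, unit)\n",
        "        size /= 1000\n",
        "\n",
        "# Byte count reported by the dry run above\n",
        "print(format_bytes(520947826))"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },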
    {
      "cell_type": "markdown",
      "source": [
        "You can also specify a parameter when running the query to limit how much data you are willing to scan. Here's an example with a low limit."
      ],
      "metadata": {
        "id": "nkAUYjlr4fLV"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Only run the query if it's less than 1 MB\n",
        "ONE_MB = 1000*1000\n",
        "safe_config = bigquery.QueryJobConfig(maximum_bytes_billed=ONE_MB)\n",
        "\n",
        "# Set up the query (will only run if it's less than 1 MB)\n",
        "safe_query_job = client.query(query, job_config=safe_config)\n",
        "\n",
        "# API request - try to run the query, and return a pandas DataFrame\n",
        "safe_query_job.to_dataframe()"
      ],
      "metadata": {
        "execution": {
          "iopub.status.busy": "2022-04-23T03:44:14.959744Z",
          "iopub.execute_input": "2022-04-23T03:44:14.960252Z",
          "iopub.status.idle": "2022-04-23T03:44:15.424113Z",
          "shell.execute_reply.started": "2022-04-23T03:44:14.960218Z",
          "shell.execute_reply": "2022-04-23T03:44:15.422855Z"
        },
        "trusted": true,
        "id": "u2DeZlki4fLV",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 556
        },
        "outputId": "3a0ce9e7-ea73-46f0-d116-4945422bec40"
      },
      "execution_count": 21,
      "outputs": [
        {
          "output_type": "error",
          "ename": "InternalServerError",
          "evalue": "ignored",
          "traceback": [
            "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
            "\u001b[0;31mInternalServerError\u001b[0m                       Traceback (most recent call last)",
            "\u001b[0;32m<ipython-input-21-3ffb41147547>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      7\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      8\u001b[0m \u001b[0;31m# API request - try to run the query, and return a pandas DataFrame\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 9\u001b[0;31m \u001b[0msafe_query_job\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mto_dataframe\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
            "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery/job.py\u001b[0m in \u001b[0;36mto_dataframe\u001b[0;34m(self, bqstorage_client, dtypes, progress_bar_type)\u001b[0m\n\u001b[1;32m   3103\u001b[0m             \u001b[0mValueError\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mIf\u001b[0m \u001b[0mthe\u001b[0m\u001b[0;31m \u001b[0m\u001b[0;31m`\u001b[0m\u001b[0mpandas\u001b[0m\u001b[0;31m`\u001b[0m \u001b[0mlibrary\u001b[0m \u001b[0mcannot\u001b[0m \u001b[0mbe\u001b[0m \u001b[0mimported\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   3104\u001b[0m         \"\"\"\n\u001b[0;32m-> 3105\u001b[0;31m         return self.result().to_dataframe(\n\u001b[0m\u001b[1;32m   3106\u001b[0m             \u001b[0mbqstorage_client\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mbqstorage_client\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   3107\u001b[0m             \u001b[0mdtypes\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mdtypes\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
            "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery/job.py\u001b[0m in \u001b[0;36mresult\u001b[0;34m(self, timeout, page_size, retry, max_results)\u001b[0m\n\u001b[1;32m   2972\u001b[0m         \"\"\"\n\u001b[1;32m   2973\u001b[0m         \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2974\u001b[0;31m             \u001b[0msuper\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mQueryJob\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mresult\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtimeout\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mtimeout\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   2975\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   2976\u001b[0m             \u001b[0;31m# Return an iterator instead of returning the job.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
            "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/google/cloud/bigquery/job.py\u001b[0m in \u001b[0;36mresult\u001b[0;34m(self, timeout, retry)\u001b[0m\n\u001b[1;32m    766\u001b[0m             \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_begin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mretry\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mretry\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    767\u001b[0m         \u001b[0;31m# TODO: modify PollingFuture so it can pass a retry argument to done().\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 768\u001b[0;31m         \u001b[0;32mreturn\u001b[0m \u001b[0msuper\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0m_AsyncJob\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mresult\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mtimeout\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mtimeout\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    769\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    770\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0mcancelled\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
            "\u001b[0;32m/usr/local/lib/python3.7/dist-packages/google/api_core/future/polling.py\u001b[0m in \u001b[0;36mresult\u001b[0;34m(self, timeout, retry)\u001b[0m\n\u001b[1;32m    133\u001b[0m             \u001b[0;31m# pylint: disable=raising-bad-type\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    134\u001b[0m             \u001b[0;31m# Pylint doesn't recognize that this is valid in this case.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 135\u001b[0;31m             \u001b[0;32mraise\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_exception\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    136\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    137\u001b[0m         \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_result\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
            "\u001b[0;31mInternalServerError\u001b[0m: 500 Query exceeded limit for bytes billed: 1000000. 521142272 or higher required.\n\n(job ID: 9ca85173-7397-408a-b6fc-204ffe97d7b5)\n\n             -----Query Job SQL Follows-----             \n\n    |    .    |    .    |    .    |    .    |    .    |\n   1:\n   2:        SELECT score, title\n   3:        FROM `bigquery-public-data.hacker_news.full`\n   4:        WHERE type = \"job\" \n   5:        \n    |    .    |    .    |    .    |    .    |    .    |"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "In this case, the query was cancelled because it exceeded the 1 MB limit. However, we can increase the limit to run the query successfully!"
      ],
      "metadata": {
        "id": "1Zs8R3_I4fLV"
      }
    },
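    {
      "cell_type": "markdown",
      "source": [
        "As a rough sketch of that last point, the cell below reruns the same query with a 1 GB limit, which is above the roughly 521 MB that the error message says is required. It assumes the `client` and `query` objects from the cells above are still defined and that your account has enough quota. `run_with_byte_limit`, `ONE_GB` and `job_post_scores` are illustrative names, not part of the BigQuery library."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "from google.cloud import bigquery\n",
        "\n",
        "# Illustrative helper: run a query only if it bills at most max_bytes\n",
        "def run_with_byte_limit(client, sql, max_bytes):\n",
        "    config = bigquery.QueryJobConfig(maximum_bytes_billed=max_bytes)\n",
        "    query_job = client.query(sql, job_config=config)\n",
        "    # API request - run the query, and return a pandas DataFrame\n",
        "    return query_job.to_dataframe()\n",
        "\n",
        "# Assumption: 1 GB is enough, since the error above asked for about 521 MB\n",
        "ONE_GB = 1000*1000*1000\n",
        "job_post_scores = run_with_byte_limit(client, query, ONE_GB)\n",
        "\n",
        "# Preview the first five rows of the result\n",
        "job_post_scores.head()"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },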
    {
      "cell_type": "markdown",
      "source": [
        "### Group By, Having & Count"
      ],
      "metadata": {
        "id": "-H9JRY8H6y6Q"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Now that you can select raw data, you're ready to learn how to group your data and count things within those groups."
      ],
      "metadata": {
        "id": "vjHQ4EZe4fLV"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "The Hacker News dataset contains information on stories and comments from the Hacker News social networking site. We'll work with the `comments` table and begin by printing the first few rows."
      ],
      "metadata": {
        "id": "jlxSqT5I4fLW"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"hacker_news\" dataset\n",
        "dataset_ref = client.dataset(\"hacker_news\", project=\"bigquery-public-data\")\n",
        "\n",
        "# API request - fetch the dataset\n",
        "dataset = client.get_dataset(dataset_ref)\n",
        "\n",
        "# Construct a reference to the \"comments\" table\n",
        "table_ref = dataset_ref.table(\"comments\")\n",
        "\n",
        "# API request - fetch the table\n",
        "table = client.get_table(table_ref)\n",
        "\n",
        "# Preview the first five lines of the \"comments\" table\n",
        "client.list_rows(table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "execution": {
          "iopub.status.busy": "2022-04-23T03:50:17.465447Z",
          "iopub.execute_input": "2022-04-23T03:50:17.465984Z",
          "iopub.status.idle": "2022-04-23T03:50:18.444486Z",
          "shell.execute_reply.started": "2022-04-23T03:50:17.465945Z",
          "shell.execute_reply": "2022-04-23T03:50:18.443937Z"
        },
        "trusted": true,
        "id": "lbhZ1WrH4fLW",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 346
        },
        "outputId": "c305c389-6ff7-4328-dca3-0e8731440f48"
      },
      "execution_count": 24,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         id  by author        time                   time_ts  \\\n",
              "0   2701393  5l     5l  1309184881 2011-06-27 14:28:01+00:00   \n",
              "1   5811403  99     99  1370234048 2013-06-03 04:34:08+00:00   \n",
              "2     21623  AF     AF  1178992400 2007-05-12 17:53:20+00:00   \n",
              "3  10159727  EA     EA  1441206574 2015-09-02 15:09:34+00:00   \n",
              "4   2988424  Iv     Iv  1315853580 2011-09-12 18:53:00+00:00   \n",
              "\n",
              "                                                text    parent deleted  dead  \\\n",
              "0  And the glazier who fixed all the broken windo...   2701243    None  None   \n",
              "1  Does canada have the equivalent of H1B/Green c...   5804452    None  None   \n",
              "2  Speaking of Rails, there are other options in ...     21611    None  None   \n",
              "3  Humans and large livestock (and maybe even pet...  10159396    None  None   \n",
              "4  I must say I reacted in the same way when I re...   2988179    None  None   \n",
              "\n",
              "   ranking  \n",
              "0        0  \n",
              "1        0  \n",
              "2        0  \n",
              "3        0  \n",
              "4        0  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-92e0a151-6765-49c8-9e8b-e8c4a5b041de\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>by</th>\n",
              "      <th>author</th>\n",
              "      <th>time</th>\n",
              "      <th>time_ts</th>\n",
              "      <th>text</th>\n",
              "      <th>parent</th>\n",
              "      <th>deleted</th>\n",
              "      <th>dead</th>\n",
              "      <th>ranking</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>2701393</td>\n",
              "      <td>5l</td>\n",
              "      <td>5l</td>\n",
              "      <td>1309184881</td>\n",
              "      <td>2011-06-27 14:28:01+00:00</td>\n",
              "      <td>And the glazier who fixed all the broken windo...</td>\n",
              "      <td>2701243</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>5811403</td>\n",
              "      <td>99</td>\n",
              "      <td>99</td>\n",
              "      <td>1370234048</td>\n",
              "      <td>2013-06-03 04:34:08+00:00</td>\n",
              "      <td>Does canada have the equivalent of H1B/Green c...</td>\n",
              "      <td>5804452</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>21623</td>\n",
              "      <td>AF</td>\n",
              "      <td>AF</td>\n",
              "      <td>1178992400</td>\n",
              "      <td>2007-05-12 17:53:20+00:00</td>\n",
              "      <td>Speaking of Rails, there are other options in ...</td>\n",
              "      <td>21611</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>10159727</td>\n",
              "      <td>EA</td>\n",
              "      <td>EA</td>\n",
              "      <td>1441206574</td>\n",
              "      <td>2015-09-02 15:09:34+00:00</td>\n",
              "      <td>Humans and large livestock (and maybe even pet...</td>\n",
              "      <td>10159396</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>2988424</td>\n",
              "      <td>Iv</td>\n",
              "      <td>Iv</td>\n",
              "      <td>1315853580</td>\n",
              "      <td>2011-09-12 18:53:00+00:00</td>\n",
              "      <td>I must say I reacted in the same way when I re...</td>\n",
              "      <td>2988179</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-92e0a151-6765-49c8-9e8b-e8c4a5b041de')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-92e0a151-6765-49c8-9e8b-e8c4a5b041de button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-92e0a151-6765-49c8-9e8b-e8c4a5b041de');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n{\n            'v': 2701393,\n            'f': \"2701393\",\n        },\n\"5l\",\n\"5l\",\n{\n            'v': 1309184881,\n            'f': \"1309184881\",\n        },\n\"2011-06-27 14:28:01+00:00\",\n\"And the glazier who fixed all the broken windows also left his money to good causes.\",\n{\n            'v': 2701243,\n            'f': \"2701243\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        }],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 5811403,\n            'f': \"5811403\",\n        },\n\"99\",\n\"99\",\n{\n            'v': 1370234048,\n            'f': \"1370234048\",\n        },\n\"2013-06-03 04:34:08+00:00\",\n\"Does canada have the equivalent of H1B/Green card for work sponsorship? What do you think of that?\",\n{\n            'v': 5804452,\n            'f': \"5804452\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        }],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': 21623,\n            'f': \"21623\",\n        },\n\"AF\",\n\"AF\",\n{\n            'v': 1178992400,\n            'f': \"1178992400\",\n        },\n\"2007-05-12 17:53:20+00:00\",\n\"Speaking of Rails, there are other options in the Python world besides Django.<p>Pylons is a very Rails-y framework with the difference being that it is made to be easy to customize. 
In Rails if you don't like something you are going to have a hard time changing it out unless you are a good hacker. In Pylons that is easy, and you've got access to Python's vastly better platform (speed, Unicode support) and libraries.<p>If you are an absolute beginning programmer it might be kind of hard to pick up, but if you've programmed a bit or you've used one or two web frameworks (especially Rails) Pylons won't be hard to learn.<p><a href=\\\"http://pylonshq.com/\\\" rel=\\\"nofollow\\\">http://pylonshq.com/<\\/a>\",\n{\n            'v': 21611,\n            'f': \"21611\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        }],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n{\n            'v': 10159727,\n            'f': \"10159727\",\n        },\n\"EA\",\n\"EA\",\n{\n            'v': 1441206574,\n            'f': \"1441206574\",\n        },\n\"2015-09-02 15:09:34+00:00\",\n\"Humans and large livestock (and maybe even pets) will have health monitoring devices embedded into their bodies in the near future.  The devices will save the insurance companies money.  Savings on insurance premiums will be the incentive to encourage mass adoption by citizens and owners of livestock.\",\n{\n            'v': 10159396,\n            'f': \"10159396\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        }],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n{\n            'v': 2988424,\n            'f': \"2988424\",\n        },\n\"Iv\",\n\"Iv\",\n{\n            'v': 1315853580,\n            'f': \"1315853580\",\n        },\n\"2011-09-12 18:53:00+00:00\",\n\"I must say I reacted in the same way when I read about Madoff. 
The fact that people who are supposed to inspect investments would fall for such a scheme was one of the first nails that was put in the esteem I had for economy specialists.\",\n{\n            'v': 2988179,\n            'f': \"2988179\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': null,\n            'f': \"null\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        }]],\n        columns: [[\"number\", \"index\"], [\"number\", \"id\"], [\"string\", \"by\"], [\"string\", \"author\"], [\"number\", \"time\"], [\"string\", \"time_ts\"], [\"string\", \"text\"], [\"number\", \"parent\"], [\"number\", \"deleted\"], [\"number\", \"dead\"], [\"number\", \"ranking\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n        rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 24
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Let's use the table to see which comments generated the most replies.  Since:\n",
        "- the `parent` column indicates the comment that was replied to, and \n",
        "- the `id` column has the unique ID used to identify each comment, \n",
        "\n",
        "we can **GROUP BY** the `parent` column and **COUNT()** the `id` column in order to figure out the number of comments that were made as responses to a specific comment.\n",
        "\n",
        "Furthermore, since we're only interested in popular comments, we'll look at comments with more than ten replies.  So, we'll only return groups **HAVING** more than ten IDs."
      ],
      "metadata": {
        "id": "5xQZeTKP4fLW"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Query to select comments that received more than 10 replies\n",
        "query_popular = \"\"\"\n",
        "                SELECT parent, COUNT(id)\n",
        "                FROM `bigquery-public-data.hacker_news.comments`\n",
        "                GROUP BY parent\n",
        "                HAVING COUNT(id) > 10\n",
        "                \"\"\""
      ],
      "metadata": {
        "execution": {
          "iopub.status.busy": "2022-04-23T03:51:01.675659Z",
          "iopub.execute_input": "2022-04-23T03:51:01.675922Z",
          "iopub.status.idle": "2022-04-23T03:51:01.682999Z",
          "shell.execute_reply.started": "2022-04-23T03:51:01.675896Z",
          "shell.execute_reply": "2022-04-23T03:51:01.682129Z"
        },
        "trusted": true,
        "id": "GTGRV8GE4fLW"
      },
      "execution_count": 25,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Set up the query (cancel the query if it would use too much of \n",
        "# your quota, with the limit set to 10 GB)\n",
        "safe_config = bigquery.QueryJobConfig(maximum_bytes_billed=10**10)\n",
        "query_job = client.query(query_popular, job_config=safe_config)\n",
        "\n",
        "# API request - run the query, and convert the results to a pandas DataFrame\n",
        "popular_comments = query_job.to_dataframe()\n",
        "\n",
        "# Print the first five rows of the DataFrame\n",
        "popular_comments.head()"
      ],
      "metadata": {
        "execution": {
          "iopub.status.busy": "2022-04-23T03:51:25.840649Z",
          "iopub.execute_input": "2022-04-23T03:51:25.841060Z",
          "iopub.status.idle": "2022-04-23T03:51:32.497782Z",
          "shell.execute_reply.started": "2022-04-23T03:51:25.841031Z",
          "shell.execute_reply": "2022-04-23T03:51:32.496981Z"
        },
        "trusted": true,
        "id": "Wb7hgSV24fLW",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "99505207-0fd9-4195-ef19-61cee55ad4c1"
      },
      "execution_count": 29,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "    parent  f0_\n",
              "0  4332978   53\n",
              "1  2970550   63\n",
              "2  3353593   68\n",
              "3  3734303   56\n",
              "4  5048699   61"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-09c50ad0-5db5-474a-96e1-ecc3934ed749\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>parent</th>\n",
              "      <th>f0_</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>4332978</td>\n",
              "      <td>53</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>2970550</td>\n",
              "      <td>63</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>3353593</td>\n",
              "      <td>68</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>3734303</td>\n",
              "      <td>56</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>5048699</td>\n",
              "      <td>61</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-09c50ad0-5db5-474a-96e1-ecc3934ed749')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-09c50ad0-5db5-474a-96e1-ecc3934ed749 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-09c50ad0-5db5-474a-96e1-ecc3934ed749');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n{\n            'v': 4332978,\n            'f': \"4332978\",\n        },\n{\n            'v': 53,\n            'f': \"53\",\n        }],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 2970550,\n            'f': \"2970550\",\n        },\n{\n            'v': 63,\n            'f': \"63\",\n        }],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': 3353593,\n            'f': \"3353593\",\n        },\n{\n            'v': 68,\n            'f': \"68\",\n        }],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n{\n            'v': 3734303,\n            'f': \"3734303\",\n        },\n{\n            'v': 56,\n            'f': \"56\",\n        }],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n{\n            'v': 5048699,\n            'f': \"5048699\",\n        },\n{\n            'v': 61,\n            'f': \"61\",\n        }]],\n        columns: [[\"number\", \"index\"], [\"number\", \"parent\"], [\"number\", \"f0_\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n        rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 29
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "popular_comments"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 441
        },
        "id": "TVI78T52tigN",
        "outputId": "5d33a14c-5412-4678-a52a-7f6f863d07f7"
      },
      "execution_count": 30,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Warning: total number of rows (77368) exceeds max_rows (20000). Falling back to pandas display.\n"
          ]
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         parent  f0_\n",
              "0       4332978   53\n",
              "1       2970550   63\n",
              "2       3353593   68\n",
              "3       3734303   56\n",
              "4       5048699   61\n",
              "...         ...  ...\n",
              "77363   1659020   37\n",
              "77364  10180728   37\n",
              "77365   9751539   37\n",
              "77366   7978163   37\n",
              "77367   4417571   37\n",
              "\n",
              "[77368 rows x 2 columns]"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-c6f96571-9ec5-4840-9754-8a61a5e8874a\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>parent</th>\n",
              "      <th>f0_</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>4332978</td>\n",
              "      <td>53</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>2970550</td>\n",
              "      <td>63</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>3353593</td>\n",
              "      <td>68</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>3734303</td>\n",
              "      <td>56</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>5048699</td>\n",
              "      <td>61</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77363</th>\n",
              "      <td>1659020</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77364</th>\n",
              "      <td>10180728</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77365</th>\n",
              "      <td>9751539</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77366</th>\n",
              "      <td>7978163</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77367</th>\n",
              "      <td>4417571</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>77368 rows × 2 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c6f96571-9ec5-4840-9754-8a61a5e8874a')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-c6f96571-9ec5-4840-9754-8a61a5e8874a button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-c6f96571-9ec5-4840-9754-8a61a5e8874a');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 30
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Each row in the `popular_comments` DataFrame corresponds to a comment that received more than ten replies.  For instance, the comment with ID `4332978` received `53` replies."
      ],
      "metadata": {
        "id": "j1eXKXzN4fLX"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "A couple of hints to make your queries even better:\n",
        "- The column resulting from `COUNT(id)` was called `f0_`. That's not a very descriptive name. You can change the name by adding `AS NumPosts` after you specify the aggregation. This is called **aliasing**.\n",
        "- If you are ever unsure what to put inside the **COUNT()** function, you can use `COUNT(1)` to count the rows in each group. Most people find it especially readable, because it makes clear that the count doesn't depend on any particular column. It can also scan less data than counting a named column (making the query faster and using less of your data access quota). Keep in mind that `COUNT(1)` counts every row in the group, while `COUNT(id)` skips rows whose `id` is NULL.\n",
        "\n",
        "Using these tricks, we can rewrite our query:"
      ],
      "metadata": {
        "id": "UVvf6W2c9LbB"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Improved version of earlier query, now with aliasing & improved readability\n",
        "query_improved = \"\"\"\n",
        "                 SELECT parent, COUNT(1) AS NumPosts\n",
        "                 FROM `bigquery-public-data.hacker_news.comments`\n",
        "                 GROUP BY parent\n",
        "                 HAVING COUNT(1) > 10\n",
        "                 \"\"\"\n",
        "\n",
        "safe_config = bigquery.QueryJobConfig(maximum_bytes_billed=10**10)\n",
        "query_job = client.query(query_improved, job_config=safe_config)\n",
        "\n",
        "# API request - run the query, and convert the results to a pandas DataFrame\n",
        "improved_df = query_job.to_dataframe()\n",
        "\n",
        "# Print the first five rows of the DataFrame\n",
        "improved_df.head()"
      ],
      "metadata": {
        "execution": {
          "iopub.status.busy": "2022-04-23T03:53:18.501704Z",
          "iopub.execute_input": "2022-04-23T03:53:18.501994Z",
          "iopub.status.idle": "2022-04-23T03:53:24.596965Z",
          "shell.execute_reply.started": "2022-04-23T03:53:18.501962Z",
          "shell.execute_reply": "2022-04-23T03:53:24.596042Z"
        },
        "trusted": true,
        "id": "YCqkWvK_4fLX",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "90487304-d2e2-477f-de7e-26dd77fba79b"
      },
      "execution_count": 27,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "    parent  NumPosts\n",
              "0  6683866        39\n",
              "1  6627329        46\n",
              "2  3476843        49\n",
              "3  7234010        48\n",
              "4  2932956        76"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-5c66fc6d-e449-4ea4-95eb-82e412c32ceb\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>parent</th>\n",
              "      <th>NumPosts</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>6683866</td>\n",
              "      <td>39</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>6627329</td>\n",
              "      <td>46</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>3476843</td>\n",
              "      <td>49</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>7234010</td>\n",
              "      <td>48</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>2932956</td>\n",
              "      <td>76</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-5c66fc6d-e449-4ea4-95eb-82e412c32ceb')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-5c66fc6d-e449-4ea4-95eb-82e412c32ceb button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-5c66fc6d-e449-4ea4-95eb-82e412c32ceb');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n{\n            'v': 6683866,\n            'f': \"6683866\",\n        },\n{\n            'v': 39,\n            'f': \"39\",\n        }],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 6627329,\n            'f': \"6627329\",\n        },\n{\n            'v': 46,\n            'f': \"46\",\n        }],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': 3476843,\n            'f': \"3476843\",\n        },\n{\n            'v': 49,\n            'f': \"49\",\n        }],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n{\n            'v': 7234010,\n            'f': \"7234010\",\n        },\n{\n            'v': 48,\n            'f': \"48\",\n        }],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n{\n            'v': 2932956,\n            'f': \"2932956\",\n        },\n{\n            'v': 76,\n            'f': \"76\",\n        }]],\n        columns: [[\"number\", \"index\"], [\"number\", \"parent\"], [\"number\", \"NumPosts\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n        rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 27
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "improved_df"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 441
        },
        "id": "e4yVjvH2tcBf",
        "outputId": "a32bf803-f8ba-4e73-c4d5-7d34be1e4266"
      },
      "execution_count": 28,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Warning: total number of rows (77368) exceeds max_rows (20000). Falling back to pandas display.\n"
          ]
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "        parent  NumPosts\n",
              "0      6683866        39\n",
              "1      6627329        46\n",
              "2      3476843        49\n",
              "3      7234010        48\n",
              "4      2932956        76\n",
              "...        ...       ...\n",
              "77363  2873865        37\n",
              "77364  6971290        37\n",
              "77365  8793579        37\n",
              "77366  6937686        37\n",
              "77367   412772        37\n",
              "\n",
              "[77368 rows x 2 columns]"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-6a09cc50-c130-405a-9683-d08628740839\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>parent</th>\n",
              "      <th>NumPosts</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>6683866</td>\n",
              "      <td>39</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>6627329</td>\n",
              "      <td>46</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>3476843</td>\n",
              "      <td>49</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>7234010</td>\n",
              "      <td>48</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>2932956</td>\n",
              "      <td>76</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77363</th>\n",
              "      <td>2873865</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77364</th>\n",
              "      <td>6971290</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77365</th>\n",
              "      <td>8793579</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77366</th>\n",
              "      <td>6937686</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>77367</th>\n",
              "      <td>412772</td>\n",
              "      <td>37</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>77368 rows × 2 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-6a09cc50-c130-405a-9683-d08628740839')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-6a09cc50-c130-405a-9683-d08628740839 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-6a09cc50-c130-405a-9683-d08628740839');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 28
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Now you have the data you want, and it has descriptive names. \n",
        "\n",
        "#### Note on using **GROUP BY**\n",
        "\n",
        "Because **GROUP BY** tells SQL how to apply aggregate functions (like **COUNT()**), it doesn't make sense to use **GROUP BY** without an aggregate function. Similarly, if a query has a **GROUP BY** clause, then every column in the **SELECT** list must be passed to either\n",
        "1. a **GROUP BY** clause, or\n",
        "2. an aggregate function.\n",
        "\n",
        "Consider the query below:\n",
        "\n"
      ],
      "metadata": {
        "id": "hNf8IMK_4fLX"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "query_good = \"\"\"\n",
        "             SELECT parent, COUNT(id)\n",
        "             FROM `bigquery-public-data.hacker_news.comments`\n",
        "             GROUP BY parent\n",
        "             \"\"\""
      ],
      "metadata": {
        "id": "dTzcM2C04fLX"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "Note that there are two variables: `parent` and `id`. \n",
        "- `parent` was passed to a **GROUP BY** command (in `GROUP BY parent`), and \n",
        "- `id` was passed to an aggregate function (in `COUNT(id)`).\n",
        "\n",
        "By contrast, the following query won't work, because the `author` column isn't passed to an aggregate function or a **GROUP BY** clause:"
      ],
      "metadata": {
        "id": "oOpimGND4fLX"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "query_bad = \"\"\"\n",
        "            SELECT author, parent, COUNT(id)\n",
        "            FROM `bigquery-public-data.hacker_news.comments`\n",
        "            GROUP BY parent\n",
        "            \"\"\""
      ],
      "metadata": {
        "id": "57X5jB1J4fLX"
      },
      "execution_count": null,
      "outputs": []
    },
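    {
      "cell_type": "markdown",
      "source": [
        "A query shaped like `query_bad` can usually be repaired in one of two ways: add the extra column to **GROUP BY**, or aggregate it. The sketch below is only an illustration under stated assumptions. It uses Python's built-in `sqlite3` module as a local stand-in for BigQuery, and the `comments` table and its four rows are made up. Note that SQLite is more lenient than BigQuery about un-aggregated columns, so it is used here only to show the two repaired queries, not to reproduce the error.\n",
        "\n",
        "```python\n",
        "import sqlite3\n",
        "\n",
        "# Toy stand-in for the comments table (made-up rows, not real Hacker News data)\n",
        "conn = sqlite3.connect(':memory:')\n",
        "conn.execute('CREATE TABLE comments (id INTEGER, parent INTEGER, author TEXT)')\n",
        "conn.executemany(\n",
        "    'INSERT INTO comments VALUES (?, ?, ?)',\n",
        "    [(1, 100, 'ann'), (2, 100, 'bob'), (3, 100, 'ann'), (4, 200, 'bob')],\n",
        ")\n",
        "\n",
        "# Fix 1: add author to GROUP BY, so there is one row per (author, parent) pair\n",
        "per_author_parent = sorted(conn.execute('''\n",
        "    SELECT author, parent, COUNT(id) AS NumPosts\n",
        "    FROM comments\n",
        "    GROUP BY author, parent\n",
        "''').fetchall())\n",
        "\n",
        "# Fix 2: keep one row per parent and pass author to an aggregate function too\n",
        "per_parent = sorted(conn.execute('''\n",
        "    SELECT parent, COUNT(DISTINCT author) AS NumAuthors, COUNT(id) AS NumPosts\n",
        "    FROM comments\n",
        "    GROUP BY parent\n",
        "''').fetchall())\n",
        "\n",
        "print(per_author_parent)\n",
        "print(per_parent)\n",
        "```\n",
        "\n",
        "Which fix is appropriate depends on the question being asked: the first changes the grouping (and therefore the number of rows returned), while the second keeps one row per `parent`."
      ],
      "metadata": {}
    },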
    {
      "cell_type": "markdown",
      "source": [
        "### Order By"
      ],
      "metadata": {
        "id": "RYy0qPJ89w6r"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "Frequently, you’ll want to sort your results. **Let's use the US Traffic Fatality Records database, which contains information on traffic accidents in the US where at least one person died.**\n",
        "\n",
        "We'll investigate the `accident_2015` table. Here is a view of the first few rows. "
      ],
      "metadata": {
        "id": "T7euCJRKBCxu"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"nhtsa_traffic_fatalities\" dataset\n",
        "dataset_ref = client.dataset(\"nhtsa_traffic_fatalities\", project=\"bigquery-public-data\")\n",
        "\n",
        "# API request - fetch the dataset\n",
        "dataset = client.get_dataset(dataset_ref)\n",
        "\n",
        "# Construct a reference to the \"accident_2015\" table\n",
        "table_ref = dataset_ref.table(\"accident_2015\")\n",
        "\n",
        "# API request - fetch the table\n",
        "table = client.get_table(table_ref)\n",
        "\n",
        "# Preview the first five lines of the \"accident_2015\" table\n",
        "client.list_rows(table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "JpqOyftp9wVT",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 404
        },
        "outputId": "3004bd39-00c3-47ec-9431-134479707ad7"
      },
      "execution_count": 31,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Warning: Total number of columns (70) exceeds max_columns (20). Falling back to pandas display.\n"
          ]
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   state_number state_name  consecutive_number  \\\n",
              "0            19       Iowa              190204   \n",
              "1            19       Iowa              190233   \n",
              "2            19       Iowa              190179   \n",
              "3            19       Iowa              190248   \n",
              "4            19       Iowa              190231   \n",
              "\n",
              "   number_of_vehicle_forms_submitted_all  \\\n",
              "0                                      1   \n",
              "1                                      1   \n",
              "2                                      1   \n",
              "3                                      1   \n",
              "4                                      1   \n",
              "\n",
              "   number_of_motor_vehicles_in_transport_mvit  \\\n",
              "0                                           1   \n",
              "1                                           1   \n",
              "2                                           1   \n",
              "3                                           1   \n",
              "4                                           1   \n",
              "\n",
              "   number_of_parked_working_vehicles  \\\n",
              "0                                  0   \n",
              "1                                  0   \n",
              "2                                  0   \n",
              "3                                  0   \n",
              "4                                  0   \n",
              "\n",
              "   number_of_forms_submitted_for_persons_not_in_motor_vehicles  \\\n",
              "0                                                  0             \n",
              "1                                                  0             \n",
              "2                                                  0             \n",
              "3                                                  0             \n",
              "4                                                  0             \n",
              "\n",
              "   number_of_persons_not_in_motor_vehicles_in_transport_mvit  \\\n",
              "0                                                  0           \n",
              "1                                                  0           \n",
              "2                                                  0           \n",
              "3                                                  0           \n",
              "4                                                  0           \n",
              "\n",
              "   number_of_persons_in_motor_vehicles_in_transport_mvit  \\\n",
              "0                                                  1       \n",
              "1                                                  1       \n",
              "2                                                  2       \n",
              "3                                                  4       \n",
              "4                                                  1       \n",
              "\n",
              "   number_of_forms_submitted_for_persons_in_motor_vehicles  ...  \\\n",
              "0                                                  1        ...   \n",
              "1                                                  1        ...   \n",
              "2                                                  2        ...   \n",
              "3                                                  4        ...   \n",
              "4                                                  1        ...   \n",
              "\n",
              "   minute_of_ems_arrival_at_hospital  related_factors_crash_level_1  \\\n",
              "0                                  2                              0   \n",
              "1                                 88                              0   \n",
              "2                                  1                              0   \n",
              "3                                 99                              0   \n",
              "4                                 88                              0   \n",
              "\n",
              "   related_factors_crash_level_1_name  related_factors_crash_level_2  \\\n",
              "0                                None                              0   \n",
              "1                                None                              0   \n",
              "2                                None                              0   \n",
              "3                                None                              0   \n",
              "4                                None                              0   \n",
              "\n",
              "   related_factors_crash_level_2_name  related_factors_crash_level_3  \\\n",
              "0                                None                              0   \n",
              "1                                None                              0   \n",
              "2                                None                              0   \n",
              "3                                None                              0   \n",
              "4                                None                              0   \n",
              "\n",
              "   related_factors_crash_level_3_name  number_of_fatalities  \\\n",
              "0                                None                     1   \n",
              "1                                None                     1   \n",
              "2                                None                     1   \n",
              "3                                None                     2   \n",
              "4                                None                     1   \n",
              "\n",
              "   number_of_drunk_drivers        timestamp_of_crash  \n",
              "0                        1 2015-09-11 20:20:00+00:00  \n",
              "1                        1 2015-11-01 00:30:00+00:00  \n",
              "2                        0 2015-05-04 16:18:00+00:00  \n",
              "3                        0 2015-11-17 12:26:00+00:00  \n",
              "4                        0 2015-10-31 04:49:00+00:00  \n",
              "\n",
              "[5 rows x 70 columns]"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-1aec5fb6-901e-4b49-8beb-a9e8337feda0\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state_number</th>\n",
              "      <th>state_name</th>\n",
              "      <th>consecutive_number</th>\n",
              "      <th>number_of_vehicle_forms_submitted_all</th>\n",
              "      <th>number_of_motor_vehicles_in_transport_mvit</th>\n",
              "      <th>number_of_parked_working_vehicles</th>\n",
              "      <th>number_of_forms_submitted_for_persons_not_in_motor_vehicles</th>\n",
              "      <th>number_of_persons_not_in_motor_vehicles_in_transport_mvit</th>\n",
              "      <th>number_of_persons_in_motor_vehicles_in_transport_mvit</th>\n",
              "      <th>number_of_forms_submitted_for_persons_in_motor_vehicles</th>\n",
              "      <th>...</th>\n",
              "      <th>minute_of_ems_arrival_at_hospital</th>\n",
              "      <th>related_factors_crash_level_1</th>\n",
              "      <th>related_factors_crash_level_1_name</th>\n",
              "      <th>related_factors_crash_level_2</th>\n",
              "      <th>related_factors_crash_level_2_name</th>\n",
              "      <th>related_factors_crash_level_3</th>\n",
              "      <th>related_factors_crash_level_3_name</th>\n",
              "      <th>number_of_fatalities</th>\n",
              "      <th>number_of_drunk_drivers</th>\n",
              "      <th>timestamp_of_crash</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>19</td>\n",
              "      <td>Iowa</td>\n",
              "      <td>190204</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>...</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>2015-09-11 20:20:00+00:00</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>19</td>\n",
              "      <td>Iowa</td>\n",
              "      <td>190233</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>...</td>\n",
              "      <td>88</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>2015-11-01 00:30:00+00:00</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>19</td>\n",
              "      <td>Iowa</td>\n",
              "      <td>190179</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>2</td>\n",
              "      <td>...</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2015-05-04 16:18:00+00:00</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>19</td>\n",
              "      <td>Iowa</td>\n",
              "      <td>190248</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4</td>\n",
              "      <td>4</td>\n",
              "      <td>...</td>\n",
              "      <td>99</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>2015-11-17 12:26:00+00:00</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>19</td>\n",
              "      <td>Iowa</td>\n",
              "      <td>190231</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>...</td>\n",
              "      <td>88</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "      <td>None</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2015-10-31 04:49:00+00:00</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 70 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-1aec5fb6-901e-4b49-8beb-a9e8337feda0')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-1aec5fb6-901e-4b49-8beb-a9e8337feda0 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-1aec5fb6-901e-4b49-8beb-a9e8337feda0');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 31
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Let's use the table to determine how the number of accidents varies with the day of the week.  Since:\n",
        "- the `consecutive_number` column contains a unique ID for each accident, and\n",
        "- the `timestamp_of_crash` column contains the date and time of the accident (the preview above shows UTC timestamps; see the BigQuery [date functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/date_functions) reference),\n",
        "\n",
        "we can:\n",
        "- **EXTRACT** the day of the week (as `day_of_week` in the query below) from the `timestamp_of_crash` column, and\n",
        "- **GROUP BY** the day of the week, before we **COUNT** the `consecutive_number` column to determine the number of accidents for each day of the week.\n",
        "\n",
        "Then we sort the results with an **ORDER BY** clause. The **DESC** keyword reverses the default ascending order, so the days with the most accidents are returned first."
      ],
      "metadata": {
        "id": "UUJ2fuKmBSc1"
      }
    },
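    {
      "cell_type": "markdown",
      "source": [
        "The **EXTRACT** / **GROUP BY** / **ORDER BY** pattern described above can be tried offline before spending any BigQuery quota. The following is a minimal sketch, not the notebook's actual query: it uses Python's built-in `sqlite3` module, and the `accidents` table with its six rows is made up for illustration. SQLite has no `EXTRACT`, so `strftime('%w', ...)` (which returns 0 for Sunday) plus 1 is used here to mimic BigQuery's `DAYOFWEEK` numbering, where 1 is Sunday and 7 is Saturday.\n",
        "\n",
        "```python\n",
        "import sqlite3\n",
        "\n",
        "# Toy table with made-up rows (hypothetical data, not the NHTSA records)\n",
        "conn = sqlite3.connect(':memory:')\n",
        "conn.execute('CREATE TABLE accidents (consecutive_number INTEGER, timestamp_of_crash TEXT)')\n",
        "conn.executemany(\n",
        "    'INSERT INTO accidents VALUES (?, ?)',\n",
        "    [\n",
        "        (1, '2015-10-31 04:49:00'),  # Saturday\n",
        "        (2, '2015-11-07 10:00:00'),  # Saturday\n",
        "        (3, '2015-11-14 23:15:00'),  # Saturday\n",
        "        (4, '2015-11-01 00:30:00'),  # Sunday\n",
        "        (5, '2015-11-08 02:05:00'),  # Sunday\n",
        "        (6, '2015-05-04 16:18:00'),  # Monday\n",
        "    ],\n",
        ")\n",
        "\n",
        "# Count accidents per day of week, busiest day first\n",
        "accidents_by_day_toy = conn.execute('''\n",
        "    SELECT COUNT(consecutive_number) AS num_accidents,\n",
        "           CAST(strftime('%w', timestamp_of_crash) AS INTEGER) + 1 AS day_of_week\n",
        "    FROM accidents\n",
        "    GROUP BY day_of_week\n",
        "    ORDER BY num_accidents DESC\n",
        "''').fetchall()\n",
        "\n",
        "print(accidents_by_day_toy)\n",
        "```\n",
        "\n",
        "With these rows the query returns `[(3, 7), (2, 1), (1, 2)]`: three accidents on day 7 (Saturday), two on day 1 (Sunday) and one on day 2 (Monday). Dropping `DESC` would list the quietest day first instead."
      ],
      "metadata": {}
    },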
    {
      "cell_type": "code",
      "source": [
        "# Query to find out the number of accidents for each day of the week\n",
        "query = \"\"\"\n",
        "        SELECT COUNT(consecutive_number) AS num_accidents, \n",
        "               EXTRACT(DAYOFWEEK FROM timestamp_of_crash) AS day_of_week\n",
        "        FROM `bigquery-public-data.nhtsa_traffic_fatalities.accident_2015`\n",
        "        GROUP BY day_of_week\n",
        "        ORDER BY num_accidents DESC\n",
        "        \"\"\""
      ],
      "metadata": {
        "id": "4I_8JldMBc04"
      },
      "execution_count": 32,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "# Set up the query (cancel the query if it would use too much of \n",
        "# your quota, with the limit set to 1 GB)\n",
        "safe_config = bigquery.QueryJobConfig(maximum_bytes_billed=10**9)\n",
        "query_job = client.query(query, job_config=safe_config)\n",
        "\n",
        "# API request - run the query, and convert the results to a pandas DataFrame\n",
        "accidents_by_day = query_job.to_dataframe()\n",
        "\n",
        "# Print the DataFrame\n",
        "accidents_by_day"
      ],
      "metadata": {
        "id": "B6Dks4_hBrLv",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 240
        },
        "outputId": "6e6a401e-5e0d-4e45-81a9-478b6b676f6a"
      },
      "execution_count": 33,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   num_accidents  day_of_week\n",
              "0           5659            7\n",
              "1           5298            1\n",
              "2           4916            6\n",
              "3           4460            5\n",
              "4           4182            4\n",
              "5           4038            2\n",
              "6           3985            3"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-7dacb67e-6a5f-484c-9e8a-17f81ed8e1e5\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>num_accidents</th>\n",
              "      <th>day_of_week</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>5659</td>\n",
              "      <td>7</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>5298</td>\n",
              "      <td>1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>4916</td>\n",
              "      <td>6</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>4460</td>\n",
              "      <td>5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>4182</td>\n",
              "      <td>4</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>4038</td>\n",
              "      <td>2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>3985</td>\n",
              "      <td>3</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-7dacb67e-6a5f-484c-9e8a-17f81ed8e1e5')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-7dacb67e-6a5f-484c-9e8a-17f81ed8e1e5 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-7dacb67e-6a5f-484c-9e8a-17f81ed8e1e5');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 33
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Notice that the data is sorted by the `num_accidents` column, where the days with more traffic accidents appear first.\n",
        "\n",
        "To map the numbers returned for the `day_of_week` column to the actual day, you might consult [the BigQuery documentation](https://cloud.google.com/bigquery/docs/reference/standard-sql/date_functions) on the `DAYOFWEEK` date part. It says that it returns \"an integer between 1 (Sunday) and 7 (Saturday), inclusively\". So, in 2015, most fatal motor accidents in the US occurred on Saturday (7) and Sunday (1), while the fewest happened on Tuesday (3)."
      ],
      "metadata": {
        "id": "vQ6usjVXBt4z"
      }
    },
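    {
      "cell_type": "markdown",
      "source": [
        "As a small sketch of that mapping, the cell below attaches day names to the `day_of_week` numbers with pandas. The `day_names` dictionary and the `example` DataFrame are illustrative names that are not part of the original lab. The counts are copied from the query output above so the cell runs without a BigQuery connection, and the numbering assumes the documented convention of 1 = Sunday through 7 = Saturday."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "import pandas as pd\n",
        "\n",
        "# Assumed mapping, following the DAYOFWEEK convention (1 = Sunday ... 7 = Saturday)\n",
        "day_names = {1: \"Sunday\", 2: \"Monday\", 3: \"Tuesday\", 4: \"Wednesday\",\n",
        "             5: \"Thursday\", 6: \"Friday\", 7: \"Saturday\"}\n",
        "\n",
        "# Counts copied from the query result above (a stand-in for accidents_by_day)\n",
        "example = pd.DataFrame({\n",
        "    \"num_accidents\": [5659, 5298, 4916, 4460, 4182, 4038, 3985],\n",
        "    \"day_of_week\": [7, 1, 6, 5, 4, 2, 3],\n",
        "})\n",
        "\n",
        "# Translate each day number into its name; with a live query result,\n",
        "# the same .map call can be applied to accidents_by_day[\"day_of_week\"]\n",
        "example[\"day_name\"] = example[\"day_of_week\"].map(day_names)\n",
        "example"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },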
    {
      "cell_type": "markdown",
      "source": [
        "### As and With"
      ],
      "metadata": {
        "id": "gC3QTzVCI6ST"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "On its own, `AS` is a convenient way to clean up the data returned by your query. It is even more powerful when combined with `WITH` in a **common table expression (CTE)**: a temporary, named table that you define at the start of a query and can then reference in the rest of it. **We're going to use a CTE** to find out **how many Bitcoin transactions were made each day over the entire timespan of the Bitcoin transactions dataset.**\n",
        "\n",
        "We'll investigate the `transactions` table. Here is a view of the first few rows."
      ],
      "metadata": {
        "id": "gsAQF0yII_2j"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"crypto_bitcoin\" dataset\n",
        "dataset_ref = client.dataset(\"crypto_bitcoin\", project=\"bigquery-public-data\")\n",
        "\n",
        "# API request - fetch the dataset\n",
        "dataset = client.get_dataset(dataset_ref)\n",
        "\n",
        "# Construct a reference to the \"transactions\" table\n",
        "table_ref = dataset_ref.table(\"transactions\")\n",
        "\n",
        "# API request - fetch the table\n",
        "table = client.get_table(table_ref)\n",
        "\n",
        "# Preview the first five lines of the \"transactions\" table\n",
        "client.list_rows(table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "PN9j763YI-rz",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 875
        },
        "outputId": "4e654ebb-f984-4c3f-846d-03a6c8920c6c"
      },
      "execution_count": 34,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "                                                hash  size  virtual_size  \\\n",
              "0  a16f3ce4dd5deb92d98ef5cf8afeaf0775ebca408f708b...   275           275   \n",
              "1  591e91f809d716912ca1d4a9295e70c3e78bab077683f7...   275           275   \n",
              "2  12b5633bad1f9c167d523ad1aa1947b2732a865bf5414e...   276           276   \n",
              "3  828ef3b079f9c23829c56fe86e85b4a69d9e06e5b54ea5...   276           276   \n",
              "4  35288d269cee1941eaebb2ea85e32b42cdb2b04284a56d...   277           277   \n",
              "\n",
              "   version  lock_time                                         block_hash  \\\n",
              "0        1          0  00000000dc55860c8a29c58d45209318fa9e9dc2c1833a...   \n",
              "1        1          0  0000000054487811fc4ff7a95be738aa5ad9320c394c48...   \n",
              "2        1          0  00000000f46e513f038baf6f2d9a95b2a28d8a6c985bcf...   \n",
              "3        1          0  00000000fb5b44edc7a1aa105075564a179d65506e2bd2...   \n",
              "4        1          0  00000000689051c09ff2cd091cc4c22c10b965eb8db3ad...   \n",
              "\n",
              "   block_number           block_timestamp block_timestamp_month  input_count  \\\n",
              "0           181 2009-01-12 06:02:13+00:00            2009-01-01            1   \n",
              "1           182 2009-01-12 06:12:16+00:00            2009-01-01            1   \n",
              "2           183 2009-01-12 06:34:22+00:00            2009-01-01            1   \n",
              "3           248 2009-01-12 20:04:20+00:00            2009-01-01            1   \n",
              "4           545 2009-01-15 05:48:32+00:00            2009-01-01            1   \n",
              "\n",
              "   output_count input_value output_value  is_coinbase fee  \\\n",
              "0             2  4000000000   4000000000        False   0   \n",
              "1             2  3000000000   3000000000        False   0   \n",
              "2             2  2900000000   2900000000        False   0   \n",
              "3             2  2800000000   2800000000        False   0   \n",
              "4             2  2500000000   2500000000        False   0   \n",
              "\n",
              "                                              inputs  \\\n",
              "0  [{'index': 0, 'spent_transaction_hash': 'f4184...   \n",
              "1  [{'index': 0, 'spent_transaction_hash': 'a16f3...   \n",
              "2  [{'index': 0, 'spent_transaction_hash': '591e9...   \n",
              "3  [{'index': 0, 'spent_transaction_hash': '12b56...   \n",
              "4  [{'index': 0, 'spent_transaction_hash': 'd71fd...   \n",
              "\n",
              "                                             outputs  \n",
              "0  [{'index': 0, 'script_asm': '04b5abd412d4341b4...  \n",
              "1  [{'index': 0, 'script_asm': '0401518fa1d1e1e3e...  \n",
              "2  [{'index': 0, 'script_asm': '04baa9d3665315562...  \n",
              "3  [{'index': 0, 'script_asm': '04bed827d37474bef...  \n",
              "4  [{'index': 0, 'script_asm': '044a656f065871a35...  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-2fe374f3-13af-4c84-916d-b6cd199645c5\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>hash</th>\n",
              "      <th>size</th>\n",
              "      <th>virtual_size</th>\n",
              "      <th>version</th>\n",
              "      <th>lock_time</th>\n",
              "      <th>block_hash</th>\n",
              "      <th>block_number</th>\n",
              "      <th>block_timestamp</th>\n",
              "      <th>block_timestamp_month</th>\n",
              "      <th>input_count</th>\n",
              "      <th>output_count</th>\n",
              "      <th>input_value</th>\n",
              "      <th>output_value</th>\n",
              "      <th>is_coinbase</th>\n",
              "      <th>fee</th>\n",
              "      <th>inputs</th>\n",
              "      <th>outputs</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>a16f3ce4dd5deb92d98ef5cf8afeaf0775ebca408f708b...</td>\n",
              "      <td>275</td>\n",
              "      <td>275</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>00000000dc55860c8a29c58d45209318fa9e9dc2c1833a...</td>\n",
              "      <td>181</td>\n",
              "      <td>2009-01-12 06:02:13+00:00</td>\n",
              "      <td>2009-01-01</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>4000000000</td>\n",
              "      <td>4000000000</td>\n",
              "      <td>False</td>\n",
              "      <td>0</td>\n",
              "      <td>[{'index': 0, 'spent_transaction_hash': 'f4184...</td>\n",
              "      <td>[{'index': 0, 'script_asm': '04b5abd412d4341b4...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>591e91f809d716912ca1d4a9295e70c3e78bab077683f7...</td>\n",
              "      <td>275</td>\n",
              "      <td>275</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0000000054487811fc4ff7a95be738aa5ad9320c394c48...</td>\n",
              "      <td>182</td>\n",
              "      <td>2009-01-12 06:12:16+00:00</td>\n",
              "      <td>2009-01-01</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>3000000000</td>\n",
              "      <td>3000000000</td>\n",
              "      <td>False</td>\n",
              "      <td>0</td>\n",
              "      <td>[{'index': 0, 'spent_transaction_hash': 'a16f3...</td>\n",
              "      <td>[{'index': 0, 'script_asm': '0401518fa1d1e1e3e...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>12b5633bad1f9c167d523ad1aa1947b2732a865bf5414e...</td>\n",
              "      <td>276</td>\n",
              "      <td>276</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>00000000f46e513f038baf6f2d9a95b2a28d8a6c985bcf...</td>\n",
              "      <td>183</td>\n",
              "      <td>2009-01-12 06:34:22+00:00</td>\n",
              "      <td>2009-01-01</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>2900000000</td>\n",
              "      <td>2900000000</td>\n",
              "      <td>False</td>\n",
              "      <td>0</td>\n",
              "      <td>[{'index': 0, 'spent_transaction_hash': '591e9...</td>\n",
              "      <td>[{'index': 0, 'script_asm': '04baa9d3665315562...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>828ef3b079f9c23829c56fe86e85b4a69d9e06e5b54ea5...</td>\n",
              "      <td>276</td>\n",
              "      <td>276</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>00000000fb5b44edc7a1aa105075564a179d65506e2bd2...</td>\n",
              "      <td>248</td>\n",
              "      <td>2009-01-12 20:04:20+00:00</td>\n",
              "      <td>2009-01-01</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>2800000000</td>\n",
              "      <td>2800000000</td>\n",
              "      <td>False</td>\n",
              "      <td>0</td>\n",
              "      <td>[{'index': 0, 'spent_transaction_hash': '12b56...</td>\n",
              "      <td>[{'index': 0, 'script_asm': '04bed827d37474bef...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>35288d269cee1941eaebb2ea85e32b42cdb2b04284a56d...</td>\n",
              "      <td>277</td>\n",
              "      <td>277</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>00000000689051c09ff2cd091cc4c22c10b965eb8db3ad...</td>\n",
              "      <td>545</td>\n",
              "      <td>2009-01-15 05:48:32+00:00</td>\n",
              "      <td>2009-01-01</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>2500000000</td>\n",
              "      <td>2500000000</td>\n",
              "      <td>False</td>\n",
              "      <td>0</td>\n",
              "      <td>[{'index': 0, 'spent_transaction_hash': 'd71fd...</td>\n",
              "      <td>[{'index': 0, 'script_asm': '044a656f065871a35...</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-2fe374f3-13af-4c84-916d-b6cd199645c5')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-2fe374f3-13af-4c84-916d-b6cd199645c5 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-2fe374f3-13af-4c84-916d-b6cd199645c5');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n\"a16f3ce4dd5deb92d98ef5cf8afeaf0775ebca408f708b2146c4fb42b41e14be\",\n{\n            'v': 275,\n            'f': \"275\",\n        },\n{\n            'v': 275,\n            'f': \"275\",\n        },\n{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        },\n\"00000000dc55860c8a29c58d45209318fa9e9dc2c1833a7226d86bc465afc6e5\",\n{\n            'v': 181,\n            'f': \"181\",\n        },\n\"2009-01-12 06:02:13+00:00\",\n\"2009-01-01\",\n{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': \"4000000000\",\n            'f': \"\\\"4000000000\\\"\",\n        },\n{\n            'v': \"4000000000\",\n            'f': \"\\\"4000000000\\\"\",\n        },\nfalse,\n{\n            'v': \"0\",\n            'f': \"\\\"0\\\"\",\n        },\n[\"{'index': 0, 'spent_transaction_hash': 'f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16', 'spent_output_index': 1, 'script_asm': '3044022027542a94d6646c51240f23a76d33088d3dd8815b25e9ea18cac67d1171a3212e02203baf203c6e7b80ebd3e588628466ea28be572fe1aaa3f30947da4763dd3b3d2b[ALL]', 'script_hex': '473044022027542a94d6646c51240f23a76d33088d3dd8815b25e9ea18cac67d1171a3212e02203baf203c6e7b80ebd3e588628466ea28be572fe1aaa3f30947da4763dd3b3d2b01', 'sequence': 4294967295, 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['12cbQLTFMXRnSzktFkuoG3eHoMeFtpTu3S'], 'value': Decimal('4000000000')}\"],\n[\"{'index': 0, 'script_asm': '04b5abd412d4341b45056d3e376cd446eca43fa871b51961330deebd84423e740daa520690e1d9e074654c59ff87b408db903649623e86f1ca5412786f61ade2bf OP_CHECKSIG', 'script_hex': 
'4104b5abd412d4341b45056d3e376cd446eca43fa871b51961330deebd84423e740daa520690e1d9e074654c59ff87b408db903649623e86f1ca5412786f61ade2bfac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['1DUDsfc23Dv9sPMEk5RsrtfzCw5ofi5sVW'], 'value': Decimal('1000000000')}\", \"{'index': 1, 'script_asm': '0411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3 OP_CHECKSIG', 'script_hex': '410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['12cbQLTFMXRnSzktFkuoG3eHoMeFtpTu3S'], 'value': Decimal('3000000000')}\"]],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n\"591e91f809d716912ca1d4a9295e70c3e78bab077683f79350f101da64588073\",\n{\n            'v': 275,\n            'f': \"275\",\n        },\n{\n            'v': 275,\n            'f': \"275\",\n        },\n{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        },\n\"0000000054487811fc4ff7a95be738aa5ad9320c394c482b27c0da28b227ad5d\",\n{\n            'v': 182,\n            'f': \"182\",\n        },\n\"2009-01-12 06:12:16+00:00\",\n\"2009-01-01\",\n{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': \"3000000000\",\n            'f': \"\\\"3000000000\\\"\",\n        },\n{\n            'v': \"3000000000\",\n            'f': \"\\\"3000000000\\\"\",\n        },\nfalse,\n{\n            'v': \"0\",\n            'f': \"\\\"0\\\"\",\n        },\n[\"{'index': 0, 'spent_transaction_hash': 'a16f3ce4dd5deb92d98ef5cf8afeaf0775ebca408f708b2146c4fb42b41e14be', 'spent_output_index': 1, 'script_asm': '304402201f27e51caeb9a0988a1e50799ff0af94a3902403c3ad4068b063e7b4d1b0a76702206713f69bd344058b0dee55a9798759092d0916dbbc3e592fee43060005ddc174[ALL]', 'script_hex': 
'47304402201f27e51caeb9a0988a1e50799ff0af94a3902403c3ad4068b063e7b4d1b0a76702206713f69bd344058b0dee55a9798759092d0916dbbc3e592fee43060005ddc17401', 'sequence': 4294967295, 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['12cbQLTFMXRnSzktFkuoG3eHoMeFtpTu3S'], 'value': Decimal('3000000000')}\"],\n[\"{'index': 0, 'script_asm': '0401518fa1d1e1e3e162852d68d9be1c0abad5e3d6297ec95f1f91b909dc1afe616d6876f92918451ca387c4387609ae1a895007096195a824baf9c38ea98c09c3 OP_CHECKSIG', 'script_hex': '410401518fa1d1e1e3e162852d68d9be1c0abad5e3d6297ec95f1f91b909dc1afe616d6876f92918451ca387c4387609ae1a895007096195a824baf9c38ea98c09c3ac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['1LzBzVqEeuQyjD2mRWHes3dgWrT9titxvq'], 'value': Decimal('100000000')}\", \"{'index': 1, 'script_asm': '0411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3 OP_CHECKSIG', 'script_hex': '410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['12cbQLTFMXRnSzktFkuoG3eHoMeFtpTu3S'], 'value': Decimal('2900000000')}\"]],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n\"12b5633bad1f9c167d523ad1aa1947b2732a865bf5414eab2f9e5ae5d5c191ba\",\n{\n            'v': 276,\n            'f': \"276\",\n        },\n{\n            'v': 276,\n            'f': \"276\",\n        },\n{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        },\n\"00000000f46e513f038baf6f2d9a95b2a28d8a6c985bcf24b9e07f0f63a29888\",\n{\n            'v': 183,\n            'f': \"183\",\n        },\n\"2009-01-12 06:34:22+00:00\",\n\"2009-01-01\",\n{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': \"2900000000\",\n            'f': \"\\\"2900000000\\\"\",\n 
       },\n{\n            'v': \"2900000000\",\n            'f': \"\\\"2900000000\\\"\",\n        },\nfalse,\n{\n            'v': \"0\",\n            'f': \"\\\"0\\\"\",\n        },\n[\"{'index': 0, 'spent_transaction_hash': '591e91f809d716912ca1d4a9295e70c3e78bab077683f79350f101da64588073', 'spent_output_index': 1, 'script_asm': '3045022052ffc1929a2d8bd365c6a2a4e3421711b4b1e1b8781698ca9075807b4227abcb0221009984107ddb9e3813782b095d0d84361ed4c76e5edaf6561d252ae162c2341cfb[ALL]', 'script_hex': '483045022052ffc1929a2d8bd365c6a2a4e3421711b4b1e1b8781698ca9075807b4227abcb0221009984107ddb9e3813782b095d0d84361ed4c76e5edaf6561d252ae162c2341cfb01', 'sequence': 4294967295, 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['12cbQLTFMXRnSzktFkuoG3eHoMeFtpTu3S'], 'value': Decimal('2900000000')}\"],\n[\"{'index': 0, 'script_asm': '04baa9d36653155627c740b3409a734d4eaf5dcca9fb4f736622ee18efcf0aec2b758b2ec40db18fbae708f691edb2d4a2a3775eb413d16e2e3c0f8d4c69119fd1 OP_CHECKSIG', 'script_hex': '4104baa9d36653155627c740b3409a734d4eaf5dcca9fb4f736622ee18efcf0aec2b758b2ec40db18fbae708f691edb2d4a2a3775eb413d16e2e3c0f8d4c69119fd1ac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['13HtsYzne8xVPdGDnmJX8gHgBZerAfJGEf'], 'value': Decimal('100000000')}\", \"{'index': 1, 'script_asm': '0411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3 OP_CHECKSIG', 'script_hex': '410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['12cbQLTFMXRnSzktFkuoG3eHoMeFtpTu3S'], 'value': Decimal('2800000000')}\"]],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n\"828ef3b079f9c23829c56fe86e85b4a69d9e06e5b54ea597eef5fb3ffef509fe\",\n{\n            'v': 276,\n            'f': \"276\",\n        },\n{\n            'v': 276,\n            'f': \"276\",\n        },\n{\n       
     'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        },\n\"00000000fb5b44edc7a1aa105075564a179d65506e2bd25f55f1629251d0f6b0\",\n{\n            'v': 248,\n            'f': \"248\",\n        },\n\"2009-01-12 20:04:20+00:00\",\n\"2009-01-01\",\n{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': \"2800000000\",\n            'f': \"\\\"2800000000\\\"\",\n        },\n{\n            'v': \"2800000000\",\n            'f': \"\\\"2800000000\\\"\",\n        },\nfalse,\n{\n            'v': \"0\",\n            'f': \"\\\"0\\\"\",\n        },\n[\"{'index': 0, 'spent_transaction_hash': '12b5633bad1f9c167d523ad1aa1947b2732a865bf5414eab2f9e5ae5d5c191ba', 'spent_output_index': 1, 'script_asm': '3045022100c12a7d54972f26d14cb311339b5122f8c187417dde1e8efb6841f55c34220ae0022066632c5cd4161efa3a2837764eee9eb84975dd54c2de2865e9752585c53e7cce[ALL]', 'script_hex': '483045022100c12a7d54972f26d14cb311339b5122f8c187417dde1e8efb6841f55c34220ae0022066632c5cd4161efa3a2837764eee9eb84975dd54c2de2865e9752585c53e7cce01', 'sequence': 4294967295, 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['12cbQLTFMXRnSzktFkuoG3eHoMeFtpTu3S'], 'value': Decimal('2800000000')}\"],\n[\"{'index': 0, 'script_asm': '04bed827d37474beffb37efe533701ac1f7c600957a4487be8b371346f016826ee6f57ba30d88a472a0e4ecd2f07599a795f1f01de78d791b382e65ee1c58b4508 OP_CHECKSIG', 'script_hex': '4104bed827d37474beffb37efe533701ac1f7c600957a4487be8b371346f016826ee6f57ba30d88a472a0e4ecd2f07599a795f1f01de78d791b382e65ee1c58b4508ac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['1ByLSV2gLRcuqUmfdYcpPQH8Npm8cccsFg'], 'value': Decimal('1000000000')}\", \"{'index': 1, 'script_asm': '0411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3 OP_CHECKSIG', 'script_hex': 
'410411db93e1dcdb8a016b49840f8c53bc1eb68a382e97b1482ecad7b148a6909a5cb2e0eaddfb84ccf9744464f82e160bfa9b8b64f9d4c03f999b8643f656b412a3ac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['12cbQLTFMXRnSzktFkuoG3eHoMeFtpTu3S'], 'value': Decimal('1800000000')}\"]],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n\"35288d269cee1941eaebb2ea85e32b42cdb2b04284a56d8b14dcc3f5c65d6055\",\n{\n            'v': 277,\n            'f': \"277\",\n        },\n{\n            'v': 277,\n            'f': \"277\",\n        },\n{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 0,\n            'f': \"0\",\n        },\n\"00000000689051c09ff2cd091cc4c22c10b965eb8db3ad5f032621cc36626175\",\n{\n            'v': 545,\n            'f': \"545\",\n        },\n\"2009-01-15 05:48:32+00:00\",\n\"2009-01-01\",\n{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': \"2500000000\",\n            'f': \"\\\"2500000000\\\"\",\n        },\n{\n            'v': \"2500000000\",\n            'f': \"\\\"2500000000\\\"\",\n        },\nfalse,\n{\n            'v': \"0\",\n            'f': \"\\\"0\\\"\",\n        },\n[\"{'index': 0, 'spent_transaction_hash': 'd71fd2f64c0b34465b7518d240c00e83f6a5b10138a7079d1252858fe7e6b577', 'spent_output_index': 0, 'script_asm': '304602210083ec8bd391269f00f3d714a54f4dbd6b8004b3e9c91f3078ff4fca42da456f4d0221008dfe1450870a717f59a494b77b36b7884381233555f8439dac4ea969977dd3f4[ALL]', 'script_hex': '49304602210083ec8bd391269f00f3d714a54f4dbd6b8004b3e9c91f3078ff4fca42da456f4d0221008dfe1450870a717f59a494b77b36b7884381233555f8439dac4ea969977dd3f401', 'sequence': 4294967295, 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['1DCbY2GYVaAMCBpuBNN5GVg3a47pNK1wdi'], 'value': Decimal('2500000000')}\"],\n[\"{'index': 0, 'script_asm': 
'044a656f065871a353f216ca26cef8dde2f03e8c16202d2e8ad769f02032cb86a5eb5e56842e92e19141d60a01928f8dd2c875a390f67c1f6c94cfc617c0ea45af OP_CHECKSIG', 'script_hex': '41044a656f065871a353f216ca26cef8dde2f03e8c16202d2e8ad769f02032cb86a5eb5e56842e92e19141d60a01928f8dd2c875a390f67c1f6c94cfc617c0ea45afac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['1DZTzaBHUDM7T3QvUKBz4qXMRpkg8jsfB5'], 'value': Decimal('100000000')}\", \"{'index': 1, 'script_asm': '04f36c67039006ec4ed2c885d7ab0763feb5deb9633cf63841474712e4cf0459356750185fc9d962d0f4a1e08e1a84f0c9a9f826ad067675403c19d752530492dc OP_CHECKSIG', 'script_hex': '4104f36c67039006ec4ed2c885d7ab0763feb5deb9633cf63841474712e4cf0459356750185fc9d962d0f4a1e08e1a84f0c9a9f826ad067675403c19d752530492dcac', 'required_signatures': 1, 'type': 'pubkey', 'addresses': ['1DCbY2GYVaAMCBpuBNN5GVg3a47pNK1wdi'], 'value': Decimal('2400000000')}\"]]],\n        columns: [[\"number\", \"index\"], [\"string\", \"hash\"], [\"number\", \"size\"], [\"number\", \"virtual_size\"], [\"number\", \"version\"], [\"number\", \"lock_time\"], [\"string\", \"block_hash\"], [\"number\", \"block_number\"], [\"string\", \"block_timestamp\"], [\"string\", \"block_timestamp_month\"], [\"number\", \"input_count\"], [\"number\", \"output_count\"], [\"number\", \"input_value\"], [\"number\", \"output_value\"], [\"string\", \"is_coinbase\"], [\"number\", \"fee\"], [\"string\", \"inputs\"], [\"string\", \"outputs\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n        rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 34
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "The `block_timestamp` column stores the date and time of each transaction as a timestamp, so we extract just the calendar date with the **DATE()** function.\n",
        "\n",
        "The conversion happens inside a common table expression (CTE) named `time`. The outer query then counts the number of transactions for each date and sorts the result so that earlier dates appear first."
      ],
      "metadata": {
        "id": "afJXEFTGK1vO"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Query to select the number of transactions per date, sorted by date\n",
        "query_with_CTE = \"\"\" \n",
        "                 WITH time AS \n",
        "                 (\n",
        "                     SELECT DATE(block_timestamp) AS trans_date\n",
        "                     FROM `bigquery-public-data.crypto_bitcoin.transactions`\n",
        "                 )\n",
        "                 SELECT COUNT(1) AS transactions,\n",
        "                        trans_date\n",
        "                 FROM time\n",
        "                 GROUP BY trans_date\n",
        "                 ORDER BY trans_date\n",
        "                 \"\"\"\n",
        "\n",
        "# Set up the query (cancel the query if it would use too much of \n",
        "# your quota, with the limit set to 10 GB)\n",
        "safe_config = bigquery.QueryJobConfig(maximum_bytes_billed=10**10)\n",
        "query_job = client.query(query_with_CTE, job_config=safe_config)\n",
        "\n",
        "# API request - run the query, and convert the results to a pandas DataFrame\n",
        "transactions_by_date = query_job.to_dataframe()\n",
        "\n",
        "# Print the first five rows\n",
        "transactions_by_date.head()"
      ],
      "metadata": {
        "id": "yvWGKLH2B8Qa",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "4461e353-211d-44a1-945b-07f6317ffb70"
      },
      "execution_count": 35,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   transactions  trans_date\n",
              "0             1  2009-01-03\n",
              "1            14  2009-01-09\n",
              "2            61  2009-01-10\n",
              "3            93  2009-01-11\n",
              "4           101  2009-01-12"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-88f87ecc-dcea-411b-a193-ee2c28f6a63e\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>transactions</th>\n",
              "      <th>trans_date</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>1</td>\n",
              "      <td>2009-01-03</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>14</td>\n",
              "      <td>2009-01-09</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>61</td>\n",
              "      <td>2009-01-10</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>93</td>\n",
              "      <td>2009-01-11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>101</td>\n",
              "      <td>2009-01-12</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-88f87ecc-dcea-411b-a193-ee2c28f6a63e')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-88f87ecc-dcea-411b-a193-ee2c28f6a63e button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-88f87ecc-dcea-411b-a193-ee2c28f6a63e');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n{\n            'v': 1,\n            'f': \"1\",\n        },\n\"2009-01-03\"],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 14,\n            'f': \"14\",\n        },\n\"2009-01-09\"],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': 61,\n            'f': \"61\",\n        },\n\"2009-01-10\"],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n{\n            'v': 93,\n            'f': \"93\",\n        },\n\"2009-01-11\"],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n{\n            'v': 101,\n            'f': \"101\",\n        },\n\"2009-01-12\"]],\n        columns: [[\"number\", \"index\"], [\"number\", \"transactions\"], [\"string\", \"trans_date\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n        rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 35
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Because the rows are returned sorted by date, we can plot the raw results directly to see the number of Bitcoin transactions per day over the whole timespan of this dataset."
      ],
      "metadata": {
        "id": "gqgnFoJ1LMYf"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "transactions_by_date.set_index('trans_date').plot()"
      ],
      "metadata": {
        "id": "qXRnbHIELMzn",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 297
        },
        "outputId": "23c9be13-57b8-4aae-85b2-c0f11474af79"
      },
      "execution_count": 36,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<matplotlib.axes._subplots.AxesSubplot at 0x7f81f8333790>"
            ]
          },
          "metadata": {},
          "execution_count": 36
        },
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ],
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYkAAAEHCAYAAABbZ7oVAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3dd3xV9f348dc7IUDCHiGMsARkG4TIEFEQgQAq1G1txX4dX6ut1tYqft1alVp/1uJe1NViq1WggiKgiIOVMGRK2IQ9wwoQks/vj3tuuLm5O+fckbyfPnhw7+d8zud8Erznfc9nijEGpZRSypekWFdAKaVU/NIgoZRSyi8NEkoppfzSIKGUUsovDRJKKaX8qhHrCtitadOmpl27drGuhlJKJZS8vLx9xph07/QqFyTatWtHbm5urKuhlFIJRUS2+ErX5iallFJ+hRQkRGSziKwQkWUikmulNRaRWSKSb/3dyEoXEZkoIutF5EcR6e1Rzjgrf76IjPNI72OVv946VwJdQymlVHSE8yQxxBjTyxiTbb0fD8wxxnQC5ljvAUYCnaw/twGvguuGDzwK9AP6Ao963PRfBW71OC8nyDWUUkpFQWX6JMYAg63X7wJzgfut9PeMa72PBSLSUERaWHlnGWMOAIjILCBHROYC9Y0xC6z094CxwOcBrqGUSkDFxcUUFBRw4sSJWFel2qpduzaZmZmkpKSElD/UIGGAL0XEAK8bY94AMowxO63ju4AM63UrYJvHuQVWWqD0Ah/pBLiGUioBFRQUUK9ePdq1a4fVqqyiyBjD/v37KSgooH379iGdE2qQuMAYs11EmgGzRGSt14WNFUAcE+gaInIbrqYt2rRp42Q1lFKVcOLECQ0QMSQiNGnShL1794Z8Tkh9EsaY7dbfe4BPcfUp7LaakbD+3mNl3w609jg900oLlJ7pI50A1/Cu3xvGmGxjTHZ6eoVhvkqpOKIBIrbC/f0HDRIiUkdE6rlfA8OBlcA0wD1CaRww1Xo9DbjRGuXUHyi0moxmAsNFpJHVYT0cmGkdOywi/a1RTTd6leXrGkqpGPth/T5OFJfEuhrKYaE8SWQA34nIcmARMN0Y8wUwARgmIvnAJdZ7gBnARmA98CZwB4DVYf0ksNj684S7E9vK85Z1zgZcndYEuIZSKoY27D3Kz99ayENTVsa6KmE5dOgQr7zySqyrUc4777zDjh07yt7fcsstrF69OoY1Ki9on4QxZiOQ5SN9PzDUR7oB7vRT1iRgko/0XKBHqNdQSsXW4aJiAPL3HI1xTcLjDhJ33HFHufTTp09To0ZsFqB455136NGjBy1btgTgrbfeikk9/NEZ10qpamP8+PFs2LCBXr16cd555zFo0CAuv/xyunXrBsDYsWPp06cP3bt354033ig7r27dujz44INkZWXRv39/du/eDcBHH31Ejx49yMrK4sILLwRg8+bNDBo0iN69e9O7d29++OGHsnL+/Oc/07NnT7Kyshg/fjwff/wxubm53HDDDfTq1YuioiIGDx5ctrTQ5MmT6dmzJz169OD++++PqD6VVeXWblJKJYbH/7uK1TsO21pmt5b1efSy7n6PT5gwgZUrV7Js2TLmzp3L6NGjWblyZdlw0EmTJtG4cWOKioo477zzuPLKK2nSpAnHjh2jf//+PPXUU9x33328+eabPPTQQzzxxBPMnDmTVq1acejQIQCaNWvGrFmzqF27Nvn5+Vx//fXk5uby+eefM3XqVBYuXEhaWhoHDhygcePGvPTSSzz33HNkZ2eXq+uOHTu4//77ycvLo1GjRgwfPpwpU6YwduzYsOpTWfokoZQKm6Pj3aOob9++5eYLTJw4sezb+bZt28jPzwegZs2aXHrppQD06dOHzZs3AzBw4EBuuukm3nzzTUpKXJ34xcXF3HrrrfTs2ZOrr766rH9h9uzZ/OpXvyItLQ2Axo
0bB6zb4sWLGTx4MOnp6dSoUYMbbriBefPmhV2fytInCaVUTAT6xh8tderUKXs9d+5cZs+ezfz580lLS2Pw4MFlM8NTUlLKho4mJydz+vRpAF577TUWLlzI9OnT6dOnD3l5ebz44otkZGSwfPlySktLqV27tu31Dqc+TZo0qdS19ElCKRW2RJ3pUK9ePY4cOeLzWGFhIY0aNSItLY21a9eyYMGCoOVt2LCBfv368cQTT5Cens62bdsoLCykRYsWJCUl8f7775d9ox82bBh///vfOX78OAAHDhwIWKe+ffvyzTffsG/fPkpKSpg8eTIXXXRR2PWpLH2SUEqFLVGbm5o0acLAgQPp0aMHqampZGScWeknJyeH1157ja5du9K5c2f69+8ftLw//vGP5OfnY4xh6NChZGVlcccdd3DllVfy3nvvkZOTU/a0kpOTw7Jly8jOzqZmzZqMGjWKp59+mptuuonbb7+d1NRU5s+fX1Z2ixYtmDBhAkOGDMEYw+jRoxkzZkzY9akscY1YrTqys7ONbjqklLOWbD3IFa/8QFbrhky9c2DI561Zs4auXbs6WDMVCl//DiKS57HKdxltblJKhS1Rm5tU+DRIKKXCVrXaH1QgGiSUUlFV1Zq4E024v38NEkqpqKlduzb79+/XQBEj7v0kwhmWq6OblFIRC7dvIjMzk4KCgrD2M1D2cu9MFyoNEkqpiIX7PJCSkhLyjmgBr2s9iejeFM7T5ialVMK57+Mfaf/AjFhXo1rQIKGUilisvsd/lFcQoytXPxoklFIR0+7nqk+DhFIqbNsPFsW6CipKNEgopcL228lLAZ15XR1okFBKKeWXBgmllFJ+aZBQSinllwYJpZRSfmmQUEop5ZcGCaWUUn5pkFBKRWzZtkPcZQ2HVVWTBgmlVKVMW74jZtfWJcedp0FCKZWwNEY4T4OEUiphaYxwngYJpVTC0uYm52mQUEop5ZcGCaVUwtLnCOdpkFBKJSx3a9NzM38ib8vB2Famigo5SIhIsogsFZHPrPftRWShiKwXkX+JSE0rvZb1fr11vJ1HGQ9Y6T+JyAiP9Bwrbb2IjPdI93kNpZQCGD3xWx6espKXvl7Pla/+EOvqVEnhPEncDazxeP9n4K/GmI7AQeBmK/1m4KCV/lcrHyLSDbgO6A7kAK9YgScZeBkYCXQDrrfyBrqGUqoaKjxeTLvx08ve5+85yvsLtsSwRhX9J6+ADXuPxroatgkpSIhIJjAaeMt6L8DFwMdWlneBsdbrMdZ7rONDrfxjgA+NMSeNMZuA9UBf6896Y8xGY8wp4ENgTJBrKKWqod9MXhLrKgT1h4+Wk/PCvFhXwzahPkm8ANwHlFrvmwCHjDGnrfcFQCvrdStgG4B1vNDKX5budY6/9EDXKEdEbhORXBHJ3bt3b4g/klIqEkWnSmJ27UTZNrW4pOp0qQcNEiJyKbDHGJMXhfpExBjzhjEm2xiTnZ6eHuvqKFWlHTlZHLNr7yhMjCBRldQIIc9A4HIRGQXUBuoDfwMaikgN65t+JrDdyr8daA0UiEgNoAGw3yPdzfMcX+n7A1xDKVUNnSguDZ5J2Srok4Qx5gFjTKYxph2ujuevjDE3AF8DV1nZxgFTrdfTrPdYx78yrmmR04DrrNFP7YFOwCJgMdDJGslU07rGNOscf9dQSsVIkkisq6CiqDLzJO4Hfi8i63H1H7xtpb8NNLHSfw+MBzDGrAL+DawGvgDuNMaUWE8JvwFm4ho99W8rb6BrKKViREOEf1VxmZBQmpvKGGPmAnOt1xtxjUzyznMCuNrP+U8BT/lInwHM8JHu8xpKKRWPqmCM0BnXSqnwlFbBG2Egd01eylvfbgwpb2kVjBIaJJRSYamKTSqBTFu+gz9NXxM8I1UzgGqQUEqFJZ7vg3lbDjh+jdzNB9h/9GS5tIUb99Nu/HTW7jrs+PWjTYOEUios8dyksu/oKcevcdVr87nm9fnl0j5fuQuAy1/63vHrR5sGCa
VUWGI54zqYklLD4RPFjjeJbdh7rNz7qjwqOKzRTUqpqmf/0ZPk7zlK/7OahJT/xkmLHK5R5O7+cGnZkhg9WzXgv7+9IGD+nYVF1KlVg5ISQ6M6gReZ/sfCLdzQr23Z+2tem8+Rk6f5/O5Bla94HNMnCaWquWvfWMB1byzgw0VbQ8pfEIP1kz7K3Ubfp2YHzee5ZtKK7YUB8x44dooBz3zFOY99yblPzgpa9oOfrmTbgeNl7xdtPsCanf77INqNn84PG/YFLTfeaZBQqppbv8e1rPX4T1bEuCb+jf9kBXuOnAyeMQyHjofffzHo2a99poufKYa//efSsK8RbzRIKFWNlSbImM1I+xjGBWgas+snN8Yw6ftNPo8VlyT+WlMaJJSqxopL7bmJOR1sIi39m3X+tw6wq2/7p91H/B47fOJ02evv8vexMkgTWDzSIKFUNeavmcSXtbsOM3v1bp/HTjn8jTkao24j3SP71OnAP/vzX/4EwC/eXsilL34X0TViSYOEUtVYOHMecl74llvey3WwNhUZY3j9mw2VKsOzs9mr9HLvNu07ViHHjwWHKnVtgIlfra90GbGkQUKpamz+xv22lOPUN/3PftzJM5+vrVQZ/1i4lZ92VWwS8q6zr36PCSFcO47nFtpCg4RS1cSgZ7/imRnl1yAqsWmbTadmYfv6dh+u177ZwIgX5nHPv5Zx9OSZPgLvGs9ctavS16qKNEgoVcUdP3WarfuPs+1AEa/PK7+aaaBb+48Fh3hv/uaQrhHPS3W4fbp0O/9YsAWAfUdP8tnyHeWOz16zp8I5P2yw50krkemMa6WquJsmLWbR5jML383fsJ8BHYLPrnavQ3TjgHZB8ybISNoyFz83t9zII2+//iCP6/q2se16ew6fsK2saNMnCaWqOM8AAXD9mws4ZjW7eI/jj3Q+QqIsH+6uZaAAcaK4hM9X7go4xyJcv/7HEtvKijYNEkpVQ8etRfoenrKyXPqW/f5GAgXm1JNELNbN6/LwF2HlH/Ny8JVfC4uKI61OzGmQUKoa2rLf1SG8/1j5pSlCueH5kgh9EgDvz9/CZhs6w6sTDRJKVUPe+yG4RfqNN1GCxPZDRVzt52ePR//36Qqen7UupnXQIKFUNWR381CCxAgA9tq8UGAo3IsohuufC7cycU6+zbUJjwYJpaqoaHYmO/UkUZU380kUGiSUqmL2HjlJ3paDtH9gBt/m+1/gzs4VSp3ouF6z8zDPfRndppZFm5zfIzvR6DwJpaqY8zw25/nl2/6Hcb78tX1rCjmxCuzIv31re5nBFByMbHSX3Xo98SWDz06PdTUAfZJQqtrad9S+tvlE6pMI5D9LCmJdBQAOHS9myrIdwTNGgQYJpaqpcJYJDyZRRjcF8/16XYbDmwYJpaqpQJvlhCsWQaJVw9SoX7M60iChVDUVaift0q3BN+N56zvf23c64az0OgCk16sVtWtWZxoklFIBvTd/S9A8/1y4NQo1cUmyxsUmagPXe/M3hz2yrN346Rw6fip4RgdokFBKBRRv/Q3uZqbRPZvHuCaReWTqKl6MYLe6FTHaH1uDhFIqoFjEiB/W7/N7LKN+LVY9PoJbB50VxRrZy/+WqvEnaJAQkdoiskhElovIKhF53EpvLyILRWS9iPxLRGpa6bWs9+ut4+08ynrASv9JREZ4pOdYaetFZLxHus9rKKWixxD9pcBfCLAUxfFTJdSpVQNJ4OnYp077b27y97v+dMl2ThSXOFUlv0J5kjgJXGyMyQJ6ATki0h/4M/BXY0xH4CBws5X/ZuCglf5XKx8i0g24DugO5ACviEiyiCQDLwMjgW7A9VZeAlxDKRUlpcZw8HjFhf86Navr2DUDdapPX7HTsetGS6AmPH+HPlm6nWe/+MmhGvkXNEgYF/fqVCnWHwNcDHxspb8LjLVej7HeYx0fKq6QPwb40Bhz0hizCVgP9LX+rDfGbDTGnAI+BMZY5/i7hlIqWgx8tbbi1p52O3m6hB
fn5Af8lg1VY+JeSYAZ6oF+vN1Hor/DXUjLcljf9vOAjri+9W8ADhlj3Ns7FQCtrNetgG0AxpjTIlIINLHSF3gU63nONq/0ftY5/q7hXb/bgNsA2rSxb8tBpRQYP7ctu1t73v5uE/9v1jpOJ9peqBGIdDBALHYADKnj2hhTYozpBWTi+ubfxdFahckY84YxJtsYk52eHh/rnShVVZSWwhIfcyXsnLENlDWl7CwssrXceBQoDgYKBLF4igprdJMx5hDwNTAAaCgi7ieRTGC79Xo70BrAOt4A2O+Z7nWOv/T9Aa6hlIqSTfuORXUexM7C8JtUzmvXyIGaOCdQc1PBQf9Bssir47roVAn9np4dcLXfygpldFO6iDS0XqcCw4A1uILFVVa2ccBU6/U06z3W8a+MKzROA66zRj+1BzoBi4DFQCdrJFNNXJ3b06xz/F1DKRUl/pbvcGpw0bf5/oe/elv1+Aim33UBH91+vjOVcYi/5qZPlxbw8NSVPo8BzP1pr9f7Pew+fJJ7/rXc1vp5CqVPogXwrtUvkQT82xjzmYisBj4UkT8BS4G3rfxvA++LyHrgAK6bPsaYVSLyb2A1cBq40xhTAiAivwFmAsnAJGPMKqus+/1cQynlEGNM1IeXRtrWXqdWDbq3bGBzbZznK0jsOXIirJv9Va/+QO4WVzOgnSv6egsaJIwxPwLn+kjfiKt/wjv9BHC1n7KeAp7ykT4DmBHqNZRSzjEm+jvCVYO+6nJ8NTcFaoJy65xRr+y1O0C4/emz1Tx0aTfvUypNZ1wrpcoJdeSNnU8bsRi14/Z/o6I/DqfUxyjfUAYC1Kvt/3u9U4ssapBQqgqx42br3e7t5LXc7HiS+Mct/SI677YLO1T+4mHyFYhLQvh9Gly7AJ48Hb2Z1xoklIpDKwoKOXyi4iznaLjlvdyoX9OORQQbpKbYUJPo8BUQQtkCNm/LQa567Qc6P/SFE9XySYOEUnHGGMNlL33HTZP870/t7zy7VgoNZSlre5ubbCsqIfgKCKH+DpZsPWRzbQLTIKFUnHHfLMK9Gfxz0VYuf+l7W+qQtyX4RkN2hIhl2w7x7Bdr/c7qjtSQzq5JtSO6Z9C9ZX1bymzTOM2WcsB389reo9FfciMUIS3LoZSKnkibXtbsPGxbHZJCeEqw40Fi7MuuoHbHkI5+8ww4qwnzNwbfe7pL83qM6tmck8WlTLz+XHYdPkGrhqlc9doPla8okJxk35OTr5FMV74637by7aRBQqk4EyhEHD5RzCtfb+APw88mJbl8Q4CdTTbXvB7dG9aW/cf8HntrXDbbDh4n54VvA5ZRIzmJV27oU/a+Q7p9q9Q2qVOT+gFGFoUr3jZyCkSbm5SKM4HuH/9v5k+89s0GpiytuEJNKOPs49Xoid/5PVanVg26NLenyShSIvDaL/vw4KiutpSXSP9WGiSUijPe7fPGmLKdzNxr9/j6Juq9ro/TEnjPn7D9enBHWjRI5dYL7dkNL3/P0bLXRadKuPOfS2wp1wkaJJSKktJSw8FjwTez977/f7h4G4Oe/Zq8LQfLOjx9TbwKpR/BLoM6NY343Giv8ur+Xb32iz4se2RYRGXcdH67stdpNZPtqFaZro98wfQf43cjJQ0SSkXJi1+t59wnZ7H7cOBRLN5BYrG1S9umfcfOHPMRD2L9xX5PkJ8LIHfzAQY88xWfLCmIQo1c3LGzRYPaNEw7swPypee04D+/HhBaGR6v7wzQyV4VaZBQKkpmrdkFEDxIeDU3uZuWkpPOHHM/NewsLOKnXa5VWqO5KJ+vfpMbveZ17D58gmMnT5dLW7fb1cwSaHtST3aMKMqoXxuA2inlnwC6tqhPn7aNg57/wrW9SPKoR6Bf8x9HdCane/OQ6jVt+Y6Q8sWaBgml4ox3n6b7fZJI2c3ZfZ8a8MxXjHhhHtsPFZEc5U+zd5PXLq/g1+/pOVz2UvkOaXcdS0oNx0+VDyC+ZLet/D4Rz12dxV+vzaJz83rBM/
uQXq9WufcjrCAw4Yqe5dKHdcvgziEdee2XrhFWNYIEuLsmL2VFgT2TH52kQ2CVijPeayIt3XZmYtun1qimUmPK7RY3cMJX0alcAL6eLjbuLT+0NTnJFSVKSg13TV4ajWrRIDWFn52bWSG9XZM6IZ3vPZS2Q3pdNk8Y7Vqe+5MVZemPeKzA+ukd59O8QW1q1Uhm1N++rRBA3byDaGU5scy7BgmlosT9zTvYEHnvw9sOuDp6l3rMwH7w05Wc37GJndULi7H+K5cWwth/95PEvPx9ju6BEIrR57QImufe4WfTvEFtn8e8b8WtPWZkn9vmzBNQs/q1/AYJgJE9mvP5yl1B6xIKJ5Z51+YmpaIk1A/viVO+h7K+88PmstenSkpJjvEYVO+YEMrIf3egjHWACFU05rylptg3WsqJSXoaJJSKskAf45JSwz3/XhZSOUk2LhMRLmMq3kCPnAjexxAPcyua1KkZPFMIbLsd2/g7cWKOnjY3KRVHej85i8KiM0uEHz3p/8a7qzB2C8IZY+NNMtB1HCjz+/EXJ9SyGOFw4ufSIKFUHPEMEAAPfbrCT05sWxY8UpFsOvT6NxsdqEl4vIfCBhLqT3hOZuT7bO89Yl/TmxOxT5ublIqSSFoVdh+O37b7SG5Iq21cqTbW7Gol+jZ/n00laZ+EUlVCON/AnWjDt6NN3mB83pB6PDqTHYfsW3ZjaJdmtpUVjmuzWwOhB8JA+a7qU3H4rVM0SCiVyKw7vvfHuO9Ts/ndh0s57WM3uGitx/Ti9eeGfY6v29HRk6f5cpU9wzm7NK/HbTYtqBeujPq1guapEeLsxV/2b8vGp0dVtkohcaKnRYOEUjG258hJpizbwQwfY+XtjBGXZ7X0W+Zl1rFwNK/ve/6A940qlJnVvjRITYnqUiO+BNoxr0FqCvdccnbQMkQkaiPRTPBdZ8OmQUKpOHHkRHHwTGH61cB2Za8nXn8umyeMtqVcY1xPH3+9NqvCMe9hmPPW7Y3oGt7LYUTL5Fv7hxydh3RxbZNq9/arkdLmJqUSWLDbjvfIJqj8on3eu9fZqVGdmj6Xu/D2wuz8kPspWnrMbn7Ga20kt1n3XBhaBSM0oEPoM9mb1nUFsvM7RL50up00SChVBXh+jj07sZ/94qcKeSvbSOGrk7xWjcrP8A10K/K+5tpdR/j1B3khlXuBtU/F3UM7Ua92is88nTIiW6gvkN5tGvpMD3bPbdkwlXl/HMJ9IzrbXqdIlGiQUCpx+XoomLEicCfvNxE21YCr49eXD27px80XtGdE9wzAtT4RnOmzqKy9R09WmAR4oji0xvKBHZuyecJo7hkWvK3fTtdYo5ncwgnObZqkhdyJnYiq7k+mVAI4etL+fgi3//z6fJ/p7ZvW4eFLu5Wtbmp35/D+o6e43Gt106MnT/PQFP8TA93G9Gpla11C5f4VNEgt//QSHz0NsaUzrpWKujO3HieHuNapFfjj7X0DDKsqAe6eH+dV3HVu+6EiPliwNYwLRJd74cHh3VxPVx2buQJoh/TQlhOvyjRIKBUlvu7BTnYsQ+A29VYNUwHXtp4QXhOL52ieZvVqscfGpSX8adM4jfPD6FQOh/fopEvPaUHbJmn0bBX5cht2mXLnQMa+/H3Mrq9BQimH/Hf5DhZtOsCTY3uUS/e8cQf7tl9ZgZpLft63Da0apjK4s2sYZ7umkX1rjlaTzLz7hjhWtruZqZk1iU5EOCfTd2d2tPy8XxtyujenV+uGXHdeaz5cvC34SbFYu0lEWovI1yKyWkRWicjdVnpjEZklIvnW342sdBGRiSKyXkR+FJHeHmWNs/Lni8g4j/Q+IrLCOmeiWI2k/q6hVCL47eSlvL9gS9l7X23/sezvTEoShnRpVlav3wzpGLvKAGdn1A2eySEjujfn+WuyuHtodDvMA3n6Zz258GxXAPe38RG4tmd1Uij/i54G/mCM6Qb0B+4UkW7AeGCOMaYTMMd6DzAS6GT9uQ14FVw3fOBRoB
/QF3jU46b/KnCrx3k5Vrq/ayiV0D77cQftxk8PaQ+GSFyTHf56QaGM0HlyTHeg/NNQXZuehmK5ereIcEXvTGrWiM+xPHf6CeAPje7KVX0y+aODQ3CD/kaMMTuNMUus10eANUArYAzwrpXtXWCs9XoM8J5xWQA0FJEWwAhgljHmgDHmIDALyLGO1TfGLDCuAdbveZXl6xpKJbQ3v90EwJb9xx29jp033topSXRpUb9C+lkRNlN505FE/nn2Xd09tBNZrV1NYWdZHesN03zPKbFDWGFTRNoB5wILgQxjzE7r0C4gw3rdCvBsPCuw0gKlF/hIJ8A1vOt1m4jkikju3r2RjytXKhoMlN29nVrSJ9mBgi8958w8Cs8bul0396q6EZDdfjmgbdnkvWj0m4QcJESkLvAf4HfGmHKLwltPAI7+Cwe6hjHmDWNMtjEmOz093clqKBUx923b817o1AJ2qSmuJqBefmYSR8IY3yOgItl8yPcF7CmmOnBPOnQvC+IWs1VgRSQFV4D4hzHmEyt5t9VUhPX3Hit9O+A5fTHTSguUnukjPdA1lEpsVnAo9rE8uJ3smkUNzi9iV11iRKRzL9wjsGr4eEoUOzfK9hLK6CYB3gbWGGOe9zg0DXCPUBoHTPVIv9Ea5dQfKLSajGYCw0WkkdVhPRyYaR07LCL9rWvd6FWWr2soldisb9/HT5U4Uny92vZ0Ji9/ZLjPdM+nB+8hvpFyooksnsx/4GLm/XEIb407L2jeFY9V/L2XWMvrRmvZcbdQ/k8aCPwSWCEiy6y0/wMmAP8WkZuBLcA11rEZwChgPXAc+BWAMeaAiDwJLLbyPWGMOWC9vgN4B0gFPrf+EOAaSiUcXy1Lp07b+ySx9skc3p+/hRvPb2tLeSk1hOeuzuLej5aDOfMzeH7rz2yUVunr/Hpwh7Ld4KqqFg1ckxcLjwdeimXdn0b6HGXlDhK+niScFDRIGGO+w/9kzKE+8hvgTj9lTQIm+UjPBSp8HTHG7Pd1DaUSmee38FM2NzfVTknmVht3cxOPhgxjpTjh/pwujpQbj5ICtN9ck+1/GG7bJmms3XUk6k9cOuNaqSjx1W78cW7FdY7iiQhktXYtTTGiewbdW9Ynu20jHrmsW4xrlrgC3eQDBe1EwTwAABfcSURBVMv3b+7Hiu2HAi717sQAMQ0SSkXJweOnKqRV9knijyM685eZFfehsIsIdGxWj/VPjSybbPexn9VlVWj8Leq48elRAfsb0uvV4uIuPmcB2LrNrbf4nF6oVBWUv+doxOeufTKHNo1dbf8Z9c8Me8zyGCd/QUffu6M1rVuLYd1831y8dW9Zn5vOb1f23v30U5X3S4g2f08S0e6QDpX+yysVZeG2CNx0fjtqpyQz6aZsbjq/HeNHnmmS8ByW+sEt/Xyen/vQJbx5Y3ZI15p+1yAeu7x72Xsnv6GCK4DNvXewsxeJMynJScz+vbNbsNpJg4RSURas3fjZK88p9/76vm0AV7PPY5d3Z6zHxjxOT1J2+rttt5b1I159NpF1bGb/FqxO0SChVBwZ3i2Da84rPxTUuxXCc5a2Ab6850I+v3uQI/Vxaka427CuzRwtv7pxYsKjdlwrFUcapdWskBboPt2nbSPbVmH1JdQQ0b1lfVbtOBw8o5df9LdnPkd152Qo1ycJpaLMYFheUBh6/gBfDp0MEBB6n0Sk27A6/aSiKk+DhFJRViPAbCpf98xYrmkU6k08knt9j1YVlx2vTuxcV8tJ2tykVJSFO2PW35NESnLifgv/fvzFNK1bsWmtOmkRYLe5eKJBQimHGWO8vpH7fzb4w3DXPgHnd2jCDxv2A763OF32yLC4Glcf7l4QrRqmOlSTBOLxz/e363rZUqQTo920uUkpBxw7eWZb0nA+uOn1XBPlstue2c69Q3rFvZ8bptWkfm3ndiMLl7+f8e6hnSqk3ThAO6s93ZfTmTEew5ojoTOulUowf5q+uuy19/1zRSid1tan/neXdE
qIzt1SP0HinmFnV0gb3bOFw7VRdtLmJqUcsP/omXWaSo0hyeMm+th/V/s4o3rYPGF0rKsQN5zcKMhOGiSUcoDnkNCDx07x5erdYZ2fabXZt0yQtnvbtjBVcUeDhFIO8Bzl2vfpOXRpHt4yDFdnZ9Ksfi0uOjsx9mxv37QOa3cdiXU1EpKd8TVme1wrpcLj3Y+w41BR2OcP7twsIfojAP5ydVasq5Bw7PyndbLpSp8klHJAakr5jWES5WYfqbq1arD04WEUHCzispe+K3fstV/0JiU5iYz6iTEvQJWnQUIpB6R4TW6oDm32jerUpFGdihPkcnroaKZEps1NSjnA+8Ghqj9J+NK9ZfVedqOq0CChlANyNx8o976wqDhGNYmdCVecEzxTNebE1wYnnli1uUkpm73z/SbW7Y58q9JEl/fQJeTvOUrPzAaxrkr14eCDqgYJpWxUWmqqxGS5GXcNYuX20Jcz99Skbi2a1K0VPKNKCNrcpJSNJnyxNqz8N/Rrw1d/uMih2kSuW8v6FXbIU9WTPkkoZaOFmw4Ez+ThqZ/1dKgmKlHE+8g3DRJKhcHdAd0g1d4VWJ+5oifJ1XAEVHXmxD+3E/FGg4RSYch6/Eug/EJ1d01eys96t2JI52aU+lsONYjr+7axpX4qcfzPwPas2nGYG/pVful0J79eaJBQqpKmLd/BtOU7Yl0NlWCa1K3FO7/qG+tqBKUd10pF4Ku1u9m871jETw4AL1xrz25kSjlJnySUisD/vJMLwBe/GxRxGWPPrdxuZEpFgz5JKFUJOS98G+sqKOWooEFCRCaJyB4RWemR1lhEZolIvvV3IytdRGSiiKwXkR9FpLfHOeOs/PkiMs4jvY+IrLDOmSjWIjf+rqFUrOw/ejLWVVDKJyfXBgvlSeIdIMcrbTwwxxjTCZhjvQcYCXSy/twGvAquGz7wKNAP6As86nHTfxW41eO8nCDXUCom5qzZE+sqKBV1QYOEMWYe4D1DaAzwrvX6XWCsR/p7xmUB0FBEWgAjgFnGmAPGmIPALCDHOlbfGLPAuGaUvOdVlq9rKBV12w4cZ17+3kqX8/Cl3QB4+ee9g+RUKj5E2nGdYYzZab3eBWRYr1sB2zzyFVhpgdILfKQHuoZSFazaUcjmfccZfY79excs3nyAq1+bb0tZN1/QnpsvaG9LWUpFQ6VHNxljjIg4Oq882DVE5DZczVu0aaOTkqqj0RNdu6GNPmd0kJzhKSwq5vb382wtUymnODHjOtLRTbutpiKsv92NtdsBz1XBMq20QOmZPtIDXaMCY8wbxphsY0x2enpibByv4t/6PUfJevxL9h87FdH5E68/1+YaKeWbkzOuIw0S0wD3CKVxwFSP9ButUU79gUKryWgmMFxEGlkd1sOBmdaxwyLS3xrVdKNXWb6uoVQ563YfcaTcS57/plLnX57Vstz7567OqlR5SsVCKENgJwPzgc4iUiAiNwMTgGEikg9cYr0HmAFsBNYDbwJ3ABhjDgBPAoutP09YaVh53rLO2QB8bqX7u4ZS5eTH8QY/390/pOx11xb1YlgTpSITtE/CGHO9n0NDfeQ1wJ1+ypkETPKRngv08JG+39c1lPKWZOOz9tpdh7n5nVz++9sLbCkvs1Ea//n1+XycV0DX5rrns0o8OuNaJTzPeUTeaylNXbadBz5ZEXJZr87dwPZDRXyzLrw5EdcF2KCnT9tGPHNFT5LsjGZK+WDQPa6VqsBztumFf/mab+8bUpZ294fLANd+DYEUl5SSJMKMFa5R1/f8a3nQ6zZKS+HRy7pzfscmNKtXm47N6nJ+h6bc/O5i0uvp9p0qepzcikSDhEp4SR6fkIKDRXy/fj8XdGpKcUlpuXynS0opKi6hXu2KGwZ1evDzCmnB5D00rNzTwS2DzgJg/gPaSqqqDm1uUgnPuxXnRHEJAEXW3wCHTxRz70fL6fnYlzzwyY+Vut64AW0Z3Dldm49UtaBPEioufLFyJwPOakqDNNe3/F
U7CunWon5EC5eVWjOKHplStiYlfZ+azYli15PF5EXb6JxRj5sGRjbz+bHLuzu6oJpS8USfJFTMFRw8zu0fLOGuD5cC8MOGfYye+B3vzd9SIe/fZuezed+xcmmnvTqrN+07Rrvx05my7Mxuce4A4fbYf1ezq/AE2w4c56K/fB20jsO7nVkVRgOEile6x7Wqktw38G0HjrNh71E27HHNe1i1o7BcvvzdR/jr7HX8dfY6Njw9iiRx3bB3FZ4ol2/O2tBGJvV/Zk5I+R6/vDsjujfny9W7Q8qvVLQ5+b1FnyRUzD0zYw0AG/cdY+j/+4aHp64CYNfh8vs3HCoqLnvd4f9m8M9FWwF4dNqqcvkWbfJetLhybhzQllo19KOiqif9P1/FnL9v/vPWlV+a27uf+MFPV3J2BKOSwvHAyC6ICI3q1HT0OkrFKw0SqtL+/v0m8raE/+3dGMOfv1gbxhkVn6lPeQ1ztVuH9LqOlq9UvNM+CVVpj/93NQCbJ4S3TPdfZv7Eq3M3BMxTUmpIth4hjBO9cgFMv+sCurdsENVrKlUZTnxC9ElCxcwrQQIEuPoe3MZNWuRkdcq0bZLGrHsurBAgnvpZD67o3crPWUrFjji4WLg+SaiY+GLlzuCZvBw7VRI8UyWlpiQz997BPoe53tCvLTf0a+t4HZSKJ/okoWx18nQJ7cZP58U5+X7zrCgo5PYPloRc5r0fBV9HyZ+Vj4/gMq99Hfy555KzWfNkjs6DUMqDBgllqyLr276/pqTComIue+m7sMr8OK+AldsLg2f0cknXZtStVYMXQ9ghrlaNJO4a2jHsayhV1WmQUJVyorh8E9BCa45CUbHvpqEpS7f7TA/m0hfDCywAv7m4U9nrp352ZsuSZ686p1y+56/J4qc/jdQnCJXwnBjcoX0SqlL+9/28cu/3HD5RIU+JtWxGcpL4nBnapXk91u6q/Bakj17WrWyk1as39KZX64Zlx37etw01koTLslqSVrMGI3s0p7ComMxGaZW+rlKxpjOuVdz6xmvC2zfr9pW9vuKV7wHo+dhMhjw3F4CpHuspudWrXfnvKqN6NudXA9vTt11jAIZ0aVbuuIhw7XltSKtZw7pmigYIpUKgQULZavaaM+sbLdl6iJJSw/FTJWw9cJxrXp9P3paDFc556ee9y15vemYUWZkNuKRrRoV8bt/eN4Q7BncA4Pq+bYAzi+69OS6bj28fQO2UZFt+HqWqO21uUrZZtu1QhTTPeQ6+1lRa/OAl5XZxExGm/uYCjDEs2nSAa99YUOGcurVq8McRnfnfizrwzbq9TF60tWyUeIPUFLKtpwmlVOVpkFARe37WunLvx778fdhl+NvmU0To277izX7KnQPL1lFqkJpC28auJqPebRqFfW2lqhqdca3iysQAcyHCdU12Jr3bNCyXJiK8f3PfsvepKcnlOqMBslo3ZM4fLuJXA9vZVhel1Bn6JKFi5u1x2WWvn70qy2eenq3OLI2x9JFhPvPoInxKOUeDhIqZi71GIPnSMK0mb96YTZKgndFKxYA2Nynb/e9FZ5H/1Ei/x0ef04K1YSx/MaxbBkMDjHZSSjlHnySU7R4Y2TXg8do1kvWpQCkb9WnbiInXn0szPwNBKkOfJJStptw5sOz1hWenlzt28wXtAWjdODWqdVKqqstslMblWS2pVzvF9rL1SUJFxHumNUC3FvXLjT5673/6MmfNbv7w0XIOHS/mlkHt6dW6ISN7NI9mVZVSlaBPEtXE8m2H2LL/mG3lfbqkoNz7r+8dzIy7B1XIN7RrBlecmwlAw9SaXJbVkhrJ+r+dUolCnySqgdMlpYyxJroteXgYja3JaJUxxWsNpvZN6/jN++Dortx9SSdSa2o/hFKJRoNEArrv4+U0rlOL8SO7hJR/5qoz6yn1fnIW4Jp/8OTYHhUmp/myq/AE/1lSwB2DOyAi7Dt6suzYsG4ZXJvdOuD5yUlCg1T720qVUs6TaG8u77Ts7GyTm5sb62
pUyuETxaQkJfn85l1aajjLWg9p84TRPs8vLTUkJbmGlxpjaP/ADJ/5PH14W39OFJewZf9xHp22ilE9m7N06yF2FlZc+tuTvzoopRKLiOQZY7K90+P+SUJEcoC/AcnAW8aYCTGukq2MMRXmC5zz2JdkNkrlu/svrpD/xOkzm/l4BgO39XuOcsnz3/DkmO6s2XWEfy7cWnbsH7f044a3Fvqsx3VeC+nNWLEraN3H9AptW1ClVOKK6yAhIsnAy8AwoABYLCLTjDGrY1szeyzadIBrXp/PtN8M5JzM8s0+BQeLWFFQSM9M17IUx06eJiU5ieOnzgSJjfuOcuq0Yeqy7bw+b2O58x+euqrc+99e3JGBHZuyecJodhYW0TC1Ji/MWcffv9vMqZJSv3XMbJTKkM7NSKuVzOvfbCStZjJX9cnk1kFn0bqx7segVFUX181NIjIAeMwYM8J6/wCAMeYZf+dE2tz07BdrWbLVtdeBMR6rKRow1jt3erDfWalx5SkxhtJS98/i2qHNGNdrEWHNzsNl55ydURdBOF1ayoa9Z0YhZdSvxYFjpyguifzfacqdA0Pqe1BKVV+J2tzUCtjm8b4A6OedSURuA24DaNOmTUQXKim1buhW641gbQkoICSVbQ8oAkl+lpNwx46kJCHJyufZGiQiJItQYmVs0aA2X63dw+DO6dSq4RoWWiMpiZTkJNbuOkJW64Z0zqiLMfBRXgEN01Lo1Kwuize7gllW64as332Ebi3r07JhKv97YQfOzqhLcpJw/FQJdWrF+z+vUireVYm7iDHmDeANcD1JRFLGA6MCLyURa3+52vcqqf5ogFBK2SHeZzVtBzzHV2ZaaUoppaIg3oPEYqCTiLQXkZrAdcC0GNdJKaWqjbhukzDGnBaR3wAzcQ2BnWSMWRXkNKWUUjaJ6yABYIyZAQSfDaaUUsp28d7cpJRSKoY0SCillPJLg4RSSim/NEgopZTyK66X5YiEiOwFtsS6Hj40BfbFuhJhSsQ6g9Y72hKx3olYZ3C23m2NMeneiVUuSMQrEcn1tS5KPEvEOoPWO9oSsd6JWGeITb21uUkppZRfGiSUUkr5pUEiet6IdQUikIh1Bq13tCVivROxzhCDemufhFJKKb/0SUIppZRfGiSUUkr5pUEiQiLSWkS+FpHVIrJKRO620huLyCwRybf+bmSldxGR+SJyUkTu9SorR0R+EpH1IjI+Eertr5x4rrNHeckislREPnOqznbXW0QaisjHIrJWRNZYW/smQr3vscpYKSKTRaR2HNX7BhH5UURWiMgPIpLlUVZUPpN21dnRz6MxRv9E8AdoAfS2XtcD1gHdgGeB8Vb6eODP1utmwHnAU8C9HuUkAxuAs4CawHKgWwLU22c58Vxnj/J+D/wT+CwR/h+xjr0L3GK9rgk0jPd649p+eBOQar3/N3BTHNX7fKCR9XoksNB6HbXPpI11duzz6NgHpLr9AaYCw4CfgBYe/3A/eeV7zOuDNACY6fH+AeCBeK+3v3Livc64djecA1yMw0HCxv9HGlg3W4lmfW2ot3uP+sa4tiX4DBgeb/W20hsB263XMftMRlpnf+XYUSdtbrKBiLQDzgUWAhnGmJ3WoV1ARpDT3R8ktwIrzXGVrLe/chxlQ51fAO4DSp2onz+VrHd7YC/wd6uZ7C0RqeNUXT1Vpt7GmO3Ac8BWYCdQaIz50rHKeoig3jcDn1uvY/KZrGSd/ZVTaRokKklE6gL/AX5njDnsecy4QnpcjjG2q96ByrFbZessIpcCe4wxec7V0ud1K/u7rgH0Bl41xpwLHMPVBOEoG37fjYAxuIJcS6COiPzCoep6XjeseovIEFw33Pudrps/dtXZic+jBolKEJEUXP8g/zDGfGIl7xaRFtbxFsCeIMVsB1p7vM+00hxjU739leMIm+o8ELhcRDYDHwIXi8gHDlUZq1521LsAKDDGuL8ZfowraDjGpnpfAmwyxuw1xhQDn+BqU3dMuPUWkX
OAt4Axxpj9VnJUP5M21dmxz6MGiQiJiABvA2uMMc97HJoGjLNej8PVNhjIYqCTiLQXkZrAdVYZjrCr3gHKsZ1ddTbGPGCMyTTGtMP1e/7KGOPYN1sb670L2CYina2kocBqm6tbxsb/t7cC/UUkzSpzKLDG7vq6hVtvEWmDK3D90hizziN/1D6TdtXZ0c9jNDpjquIf4AJcj4A/AsusP6OAJrg6RvOB2UBjK39zXN8IDwOHrNf1rWOjcI1G2AA8mAj19ldOPNfZq8zBOD+6yc7/R3oBuVZZU7BGuCRAvR8H1gIrgfeBWnFU77eAgx55cz3Kispn0q46O/l51GU5lFJK+aXNTUoppfzSIKGUUsovDRJKKaX80iChlFLKLw0SSiml/NIgoZRSyi8NEqraE9cy3HfEQT3mikh2kDy/E5G0aNVJKQ0SSkFDoEKQEJEaMahLML8DNEioqNEgoRRMADqIyDIRWSwi34rINKylL0RkiojkWZu53OY+SUSOishTIrJcRBaISIaVfrW4NtlZLiLz/F1URFJF5ENxbSL0KZDqcexVEcm1rvm4lXYXroXyvhaRr6204eLa8GeJiHxkLfCmlG10xrWq9qyllT8zxvQQkcHAdKCHMWaTdbyxMeaAiKTiWtfnImPMfhExwOXGmP+KyLPAYWPMn0RkBZBjjNkuIg2NMYf8XPf31nX+x1q0bQnQ3xiT63HNZFzLM9xljPnRWpww2xizT0Sa4lrHZ6Qx5piI3I9r2YsnHPtlqWpHnySUqmiRO0BY7hKR5cACXKuDdrLST+HaSAcgD2hnvf4eeEdEbsW1y5k/FwIfABhjfsS17o7bNSKyBFgKdMe1W5m3/lb69yKyDNdCcG1D+QGVClU8trkqFWvH3C+sJ4tLgAHGmOMiMhdw79NcbM48ipdgfZ6MMbeLSD9gNJAnIn2Mx5LOwYhIe+Be4DxjzEERecfjmuWyArOMMdeH88MpFQ59klAKjuDaF9iXBsBBK0B0wfXtPSAR6WCMWWiMeQTXjnKt/WSdB/zcOqcHcI6VXh9XoCq0+jlG+qnrAmCgiHS0yqgjImcHq59S4dAnCVXtWf0L34vISqAI2O1x+AvgdhFZg2vf4QUhFPkXEemE65v+HGC5n3yv4tqSdA2ufRbyrPosF5GluJbY3oar+crtDeALEdlhjBkiIjcBk0WklnX8IVxLXCtlC+24Vkop5Zc2NymllPJLm5uUcpiIjAD+7JW8yRjzs1jUR6lwaHOTUkopv7S5SSmllF8aJJRSSvmlQUIppZRfGiSUUkr59f8Bb01R7NIuTVUAAAAASUVORK5CYII=\n"
          },
          "metadata": {
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "As you can see, common table expressions (CTEs) let you shift a lot of your data cleaning into SQL. **That's an especially good thing in the case of BigQuery, because it is vastly faster than doing the work in Pandas.**"
      ],
      "metadata": {
        "id": "PZq7HuBsLQye"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "### Joining data\n",
        "\n",
        "When our data lives across different tables, how do we analyze it? By\n",
        "JOINing the tables together. A `JOIN` combines rows in the left table with\n",
        "corresponding rows in the right table, where the meaning of “corresponding” is based on how we specify the join.\n",
        "\n",
        "GitHub is the most popular place to collaborate on software projects. A GitHub **repository** (or **repo**) is a collection of files associated with a specific project. Most repos on GitHub are shared under a specific legal license, which determines the legal restrictions on how they are used.  **For our example, we're going to look at how many different files have been released under each license.** \n",
        "\n",
        "We'll work with two tables in the database.  The first table is the `licenses` table, which provides the name of each GitHub repo (in the `repo_name` column) and its corresponding license.  Here's a view of the first five rows."
      ],
      "metadata": {
        "id": "xkwgZ5D_NDae"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"github_repos\" dataset\n",
        "dataset_ref = client.dataset(\"github_repos\", project=\"bigquery-public-data\")\n",
        "\n",
        "# API request - fetch the dataset\n",
        "dataset = client.get_dataset(dataset_ref)\n",
        "\n",
        "# Construct a reference to the \"licenses\" table\n",
        "licenses_ref = dataset_ref.table(\"licenses\")\n",
        "\n",
        "# API request - fetch the table\n",
        "licenses_table = client.get_table(licenses_ref)\n",
        "\n",
        "# Preview the first five lines of the \"licenses\" table\n",
        "client.list_rows(licenses_table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "TYTZKEmlLUTY",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "868ecbdf-bf33-4690-84b9-2625c33351f1"
      },
      "execution_count": 42,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "                                   repo_name       license\n",
              "0                        nbstreet/batteryAce  artistic-2.0\n",
              "1  thecodersguild/wordpress-theming-workshop  artistic-2.0\n",
              "2           hyeon1219e/freezing-octo-dubstep  artistic-2.0\n",
              "3                                mfinc/mfinc  artistic-2.0\n",
              "4                        gitpan/Map-Tube-NYC  artistic-2.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-ce5e4b0e-dd9c-4642-a0e8-46bb0628e3ba\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>repo_name</th>\n",
              "      <th>license</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>nbstreet/batteryAce</td>\n",
              "      <td>artistic-2.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>thecodersguild/wordpress-theming-workshop</td>\n",
              "      <td>artistic-2.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>hyeon1219e/freezing-octo-dubstep</td>\n",
              "      <td>artistic-2.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>mfinc/mfinc</td>\n",
              "      <td>artistic-2.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>gitpan/Map-Tube-NYC</td>\n",
              "      <td>artistic-2.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-ce5e4b0e-dd9c-4642-a0e8-46bb0628e3ba')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-ce5e4b0e-dd9c-4642-a0e8-46bb0628e3ba button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-ce5e4b0e-dd9c-4642-a0e8-46bb0628e3ba');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n\"nbstreet/batteryAce\",\n\"artistic-2.0\"],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n\"thecodersguild/wordpress-theming-workshop\",\n\"artistic-2.0\"],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n\"hyeon1219e/freezing-octo-dubstep\",\n\"artistic-2.0\"],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n\"mfinc/mfinc\",\n\"artistic-2.0\"],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n\"gitpan/Map-Tube-NYC\",\n\"artistic-2.0\"]],\n        columns: [[\"number\", \"index\"], [\"string\", \"repo_name\"], [\"string\", \"license\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n        rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 42
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "The second table is the `sample_files` table, which provides, among other information, the GitHub repo that each file belongs to (in the `repo_name` column).  The first several rows of this table are printed below."
      ],
      "metadata": {
        "id": "1Xxrz8BxNMsF"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"sample_files\" table\n",
        "files_ref = dataset_ref.table(\"sample_files\")\n",
        "\n",
        "# API request - fetch the table\n",
        "files_table = client.get_table(files_ref)\n",
        "\n",
        "# Preview the first five lines of the \"sample_files\" table\n",
        "client.list_rows(files_table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "gzyzOy7DNPti",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "b2ec39c1-620e-45c4-cf2c-7dd98ce3c01b"
      },
      "execution_count": 38,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "  repo_name                ref  \\\n",
              "0   git/git  refs/heads/master   \n",
              "1   np/ling  refs/heads/master   \n",
              "2   np/ling  refs/heads/master   \n",
              "3   np/ling  refs/heads/master   \n",
              "4   np/ling  refs/heads/master   \n",
              "\n",
              "                                                path   mode  \\\n",
              "0                                           RelNotes  40960   \n",
              "1       tests/success/plug_compose.t/plug_compose.ll  40960   \n",
              "2  fixtures/strict-par-success/parallel_assoc_lef...  40960   \n",
              "3  fixtures/sequence/parallel_assoc_2tensor2_left.ll  40960   \n",
              "4                        fixtures/success/my_dual.ll  40960   \n",
              "\n",
              "                                         id  \\\n",
              "0  62615ffa4e97803da96aefbc798ab50f949a8db7   \n",
              "1  0c1605e4b447158085656487dc477f7670c4bac1   \n",
              "2  b59bff84ec03d12fabd3b51a27ed7e39a180097e   \n",
              "3  f29523e3fb65702d99478e429eac6f801f32152b   \n",
              "4  38a3af095088f90dfc956cb990e893909c3ab286   \n",
              "\n",
              "                           symlink_target  \n",
              "0       Documentation/RelNotes/2.10.0.txt  \n",
              "1   ../../../fixtures/all/plug_compose.ll  \n",
              "2           ../all/parallel_assoc_left.ll  \n",
              "3  ../all/parallel_assoc_2tensor2_left.ll  \n",
              "4                       ../all/my_dual.ll  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-7e05122b-4222-4717-8d50-2a8cb24163cc\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>repo_name</th>\n",
              "      <th>ref</th>\n",
              "      <th>path</th>\n",
              "      <th>mode</th>\n",
              "      <th>id</th>\n",
              "      <th>symlink_target</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>git/git</td>\n",
              "      <td>refs/heads/master</td>\n",
              "      <td>RelNotes</td>\n",
              "      <td>40960</td>\n",
              "      <td>62615ffa4e97803da96aefbc798ab50f949a8db7</td>\n",
              "      <td>Documentation/RelNotes/2.10.0.txt</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>np/ling</td>\n",
              "      <td>refs/heads/master</td>\n",
              "      <td>tests/success/plug_compose.t/plug_compose.ll</td>\n",
              "      <td>40960</td>\n",
              "      <td>0c1605e4b447158085656487dc477f7670c4bac1</td>\n",
              "      <td>../../../fixtures/all/plug_compose.ll</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>np/ling</td>\n",
              "      <td>refs/heads/master</td>\n",
              "      <td>fixtures/strict-par-success/parallel_assoc_lef...</td>\n",
              "      <td>40960</td>\n",
              "      <td>b59bff84ec03d12fabd3b51a27ed7e39a180097e</td>\n",
              "      <td>../all/parallel_assoc_left.ll</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>np/ling</td>\n",
              "      <td>refs/heads/master</td>\n",
              "      <td>fixtures/sequence/parallel_assoc_2tensor2_left.ll</td>\n",
              "      <td>40960</td>\n",
              "      <td>f29523e3fb65702d99478e429eac6f801f32152b</td>\n",
              "      <td>../all/parallel_assoc_2tensor2_left.ll</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>np/ling</td>\n",
              "      <td>refs/heads/master</td>\n",
              "      <td>fixtures/success/my_dual.ll</td>\n",
              "      <td>40960</td>\n",
              "      <td>38a3af095088f90dfc956cb990e893909c3ab286</td>\n",
              "      <td>../all/my_dual.ll</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-7e05122b-4222-4717-8d50-2a8cb24163cc')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-7e05122b-4222-4717-8d50-2a8cb24163cc button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-7e05122b-4222-4717-8d50-2a8cb24163cc');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n\"git/git\",\n\"refs/heads/master\",\n\"RelNotes\",\n{\n            'v': 40960,\n            'f': \"40960\",\n        },\n\"62615ffa4e97803da96aefbc798ab50f949a8db7\",\n\"Documentation/RelNotes/2.10.0.txt\"],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n\"np/ling\",\n\"refs/heads/master\",\n\"tests/success/plug_compose.t/plug_compose.ll\",\n{\n            'v': 40960,\n            'f': \"40960\",\n        },\n\"0c1605e4b447158085656487dc477f7670c4bac1\",\n\"../../../fixtures/all/plug_compose.ll\"],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n\"np/ling\",\n\"refs/heads/master\",\n\"fixtures/strict-par-success/parallel_assoc_left.ll\",\n{\n            'v': 40960,\n            'f': \"40960\",\n        },\n\"b59bff84ec03d12fabd3b51a27ed7e39a180097e\",\n\"../all/parallel_assoc_left.ll\"],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n\"np/ling\",\n\"refs/heads/master\",\n\"fixtures/sequence/parallel_assoc_2tensor2_left.ll\",\n{\n            'v': 40960,\n            'f': \"40960\",\n        },\n\"f29523e3fb65702d99478e429eac6f801f32152b\",\n\"../all/parallel_assoc_2tensor2_left.ll\"],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n\"np/ling\",\n\"refs/heads/master\",\n\"fixtures/success/my_dual.ll\",\n{\n            'v': 40960,\n            'f': \"40960\",\n        },\n\"38a3af095088f90dfc956cb990e893909c3ab286\",\n\"../all/my_dual.ll\"]],\n        columns: [[\"number\", \"index\"], [\"string\", \"repo_name\"], [\"string\", \"ref\"], [\"string\", \"path\"], [\"number\", \"mode\"], [\"string\", \"id\"], [\"string\", \"symlink_target\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n       
 rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 38
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Next, we write a query that uses information in both tables to determine how many files are released in each license."
      ],
      "metadata": {
        "id": "SIb2OHmnNV0H"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Query to determine the number of files per license, sorted by number of files\n",
        "query = \"\"\"\n",
        "        SELECT L.license, COUNT(1) AS number_of_files\n",
        "        FROM `bigquery-public-data.github_repos.sample_files` AS sf\n",
        "        INNER JOIN `bigquery-public-data.github_repos.licenses` AS L \n",
        "            ON sf.repo_name = L.repo_name\n",
        "        GROUP BY L.license\n",
        "        ORDER BY number_of_files DESC\n",
        "        \"\"\"\n",
        "\n",
        "# Set up the query (cancel the query if it would use too much of \n",
        "# your quota, with the limit set to 10 GB)\n",
        "safe_config = bigquery.QueryJobConfig(maximum_bytes_billed=10**10)\n",
        "query_job = client.query(query, job_config=safe_config)\n",
        "\n",
        "# API request - run the query, and convert the results to a pandas DataFrame\n",
        "file_count_by_license = query_job.to_dataframe()"
      ],
      "metadata": {
        "id": "3WWRJmdWNYR4"
      },
      "execution_count": 39,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "It's a big query, and so we'll investigate each piece separately.\n",
        "\n",
        "![](https://i.imgur.com/QeufD01.png)\n",
        "    \n",
        "We'll begin with the **JOIN** (highlighted in blue above).  This specifies the sources of data and how to join them. We use **ON** to specify that we combine the tables by matching the values in the `repo_name` columns in the tables.\n",
        "\n",
        "Next, we'll talk about **SELECT** and **GROUP BY** (highlighted in yellow).  The **GROUP BY** breaks the data into a different group for each license, before we **COUNT** the number of rows in the `sample_files` table that corresponds to each license.  (Remember that you can count the number of rows with `COUNT(1)`.) \n",
        "\n",
        "Finally, the **ORDER BY** (highlighted in purple) sorts the results so that licenses with more files appear first.\n",
        "\n",
        "It was a big query, but it gave us a nice table summarizing how many files have been committed under each license:  "
      ],
      "metadata": {
        "id": "WA0TW4FpNdQ9"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Print the DataFrame\n",
        "file_count_by_license"
      ],
      "metadata": {
        "id": "4hGOBl6rNfve",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 413
        },
        "outputId": "cbb2b05f-1e9a-4624-d877-7d00226d9053"
      },
      "execution_count": 40,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         license  number_of_files\n",
              "0            mit         20408848\n",
              "1        gpl-2.0         16440828\n",
              "2     apache-2.0          7114054\n",
              "3        gpl-3.0          4840103\n",
              "4   bsd-3-clause          3149733\n",
              "5       agpl-3.0          1321015\n",
              "6       lgpl-2.1           775792\n",
              "7   bsd-2-clause           687381\n",
              "8       lgpl-3.0           569941\n",
              "9        mpl-2.0           458331\n",
              "10       cc0-1.0           406823\n",
              "11       epl-1.0           312269\n",
              "12     unlicense           208494\n",
              "13  artistic-2.0           147904\n",
              "14           isc           118063"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-cec9e7f8-bf86-45cb-b468-df3bb881abb8\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>license</th>\n",
              "      <th>number_of_files</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>mit</td>\n",
              "      <td>20408848</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>gpl-2.0</td>\n",
              "      <td>16440828</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>apache-2.0</td>\n",
              "      <td>7114054</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>gpl-3.0</td>\n",
              "      <td>4840103</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>bsd-3-clause</td>\n",
              "      <td>3149733</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>agpl-3.0</td>\n",
              "      <td>1321015</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>lgpl-2.1</td>\n",
              "      <td>775792</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>bsd-2-clause</td>\n",
              "      <td>687381</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>lgpl-3.0</td>\n",
              "      <td>569941</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9</th>\n",
              "      <td>mpl-2.0</td>\n",
              "      <td>458331</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>10</th>\n",
              "      <td>cc0-1.0</td>\n",
              "      <td>406823</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>11</th>\n",
              "      <td>epl-1.0</td>\n",
              "      <td>312269</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>12</th>\n",
              "      <td>unlicense</td>\n",
              "      <td>208494</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>13</th>\n",
              "      <td>artistic-2.0</td>\n",
              "      <td>147904</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>14</th>\n",
              "      <td>isc</td>\n",
              "      <td>118063</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-cec9e7f8-bf86-45cb-b468-df3bb881abb8')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-cec9e7f8-bf86-45cb-b468-df3bb881abb8 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-cec9e7f8-bf86-45cb-b468-df3bb881abb8');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 40
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "There are a few more types of JOIN, along with how to use UNIONs to pull information from multiple tables. We'll work with the [Hacker News](https://www.kaggle.com/hacker-news/hacker-news) dataset. We begin by reviewing the first several rows of the `comments` table."
      ],
      "metadata": {
        "id": "bSlj8eo5PZ85"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"hacker_news\" dataset\n",
        "dataset_ref = client.dataset(\"hacker_news\", project=\"bigquery-public-data\")\n",
        "\n",
        "# API request - fetch the dataset\n",
        "dataset = client.get_dataset(dataset_ref)\n",
        "\n",
        "# Construct a reference to the \"comments\" table\n",
        "table_ref = dataset_ref.table(\"comments\")\n",
        "\n",
        "# API request - fetch the table\n",
        "table = client.get_table(table_ref)\n",
        "\n",
        "# Preview the first five lines of the table\n",
        "client.list_rows(table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "ZC-UGEr3Paq0",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 346
        },
        "outputId": "b36c742d-ad93-496e-abb2-5bec82dfe7b0"
      },
      "execution_count": 44,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         id  by author        time                   time_ts  \\\n",
              "0   2701393  5l     5l  1309184881 2011-06-27 14:28:01+00:00   \n",
              "1   5811403  99     99  1370234048 2013-06-03 04:34:08+00:00   \n",
              "2     21623  AF     AF  1178992400 2007-05-12 17:53:20+00:00   \n",
              "3  10159727  EA     EA  1441206574 2015-09-02 15:09:34+00:00   \n",
              "4   2988424  Iv     Iv  1315853580 2011-09-12 18:53:00+00:00   \n",
              "\n",
              "                                                text    parent deleted  dead  \\\n",
              "0  And the glazier who fixed all the broken windo...   2701243    None  None   \n",
              "1  Does canada have the equivalent of H1B/Green c...   5804452    None  None   \n",
              "2  Speaking of Rails, there are other options in ...     21611    None  None   \n",
              "3  Humans and large livestock (and maybe even pet...  10159396    None  None   \n",
              "4  I must say I reacted in the same way when I re...   2988179    None  None   \n",
              "\n",
              "   ranking  \n",
              "0        0  \n",
              "1        0  \n",
              "2        0  \n",
              "3        0  \n",
              "4        0  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-2c4fd6ab-9db4-4f97-bfef-da3e80cc208a\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>by</th>\n",
              "      <th>author</th>\n",
              "      <th>time</th>\n",
              "      <th>time_ts</th>\n",
              "      <th>text</th>\n",
              "      <th>parent</th>\n",
              "      <th>deleted</th>\n",
              "      <th>dead</th>\n",
              "      <th>ranking</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>2701393</td>\n",
              "      <td>5l</td>\n",
              "      <td>5l</td>\n",
              "      <td>1309184881</td>\n",
              "      <td>2011-06-27 14:28:01+00:00</td>\n",
              "      <td>And the glazier who fixed all the broken windo...</td>\n",
              "      <td>2701243</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>5811403</td>\n",
              "      <td>99</td>\n",
              "      <td>99</td>\n",
              "      <td>1370234048</td>\n",
              "      <td>2013-06-03 04:34:08+00:00</td>\n",
              "      <td>Does canada have the equivalent of H1B/Green c...</td>\n",
              "      <td>5804452</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>21623</td>\n",
              "      <td>AF</td>\n",
              "      <td>AF</td>\n",
              "      <td>1178992400</td>\n",
              "      <td>2007-05-12 17:53:20+00:00</td>\n",
              "      <td>Speaking of Rails, there are other options in ...</td>\n",
              "      <td>21611</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>10159727</td>\n",
              "      <td>EA</td>\n",
              "      <td>EA</td>\n",
              "      <td>1441206574</td>\n",
              "      <td>2015-09-02 15:09:34+00:00</td>\n",
              "      <td>Humans and large livestock (and maybe even pet...</td>\n",
              "      <td>10159396</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>2988424</td>\n",
              "      <td>Iv</td>\n",
              "      <td>Iv</td>\n",
              "      <td>1315853580</td>\n",
              "      <td>2011-09-12 18:53:00+00:00</td>\n",
              "      <td>I must say I reacted in the same way when I re...</td>\n",
              "      <td>2988179</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-2c4fd6ab-9db4-4f97-bfef-da3e80cc208a')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-2c4fd6ab-9db4-4f97-bfef-da3e80cc208a button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-2c4fd6ab-9db4-4f97-bfef-da3e80cc208a');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 44
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"stories\" table\n",
        "table_ref = dataset_ref.table(\"stories\")\n",
        "\n",
        "# API request - fetch the table\n",
        "table = client.get_table(table_ref)\n",
        "\n",
        "# Preview the first five lines of the table\n",
        "client.list_rows(table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "GG9M1WJYPgrg",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 495
        },
        "outputId": "8ecf0e6e-ee3c-43fd-a92d-614da5ef5dd5"
      },
      "execution_count": 45,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "        id            by  score        time                   time_ts  \\\n",
              "0  6940813     sarath237      0  1387536270 2013-12-20 10:44:30+00:00   \n",
              "1  6991401  123123321321      0  1388508751 2013-12-31 16:52:31+00:00   \n",
              "2  1531556           ssn      0  1279617234 2010-07-20 09:13:54+00:00   \n",
              "3  5012398          hoju      0  1357387877 2013-01-05 12:11:17+00:00   \n",
              "4  7214182         kogir      0  1401561740 2014-05-31 18:42:20+00:00   \n",
              "\n",
              "                                               title  \\\n",
              "0                            Sheryl Brindo Hot Pics    \n",
              "1  Are you people also put off by the culture of ...   \n",
              "2                     New UI for Google Image Search   \n",
              "3                       Historic website screenshots   \n",
              "4                                        Placeholder   \n",
              "\n",
              "                                                 url  \\\n",
              "0         http://www.youtube.com/watch?v=ym1cyxneB0Y   \n",
              "1                                                      \n",
              "2  http://googlesystem.blogspot.com/2010/07/googl...   \n",
              "3  http://webscraping.com/blog/Generate-website-s...   \n",
              "4                                                      \n",
              "\n",
              "                                                text deleted  dead  \\\n",
              "0                             Sheryl Brindo Hot Pics    None  True   \n",
              "1  They&#x27;re pretty explicitly &#x27;startup f...    None  True   \n",
              "2                    Again following on Bing's lead.    None  None   \n",
              "3  Python script to generate historic screenshots...    None  None   \n",
              "4                                      Mind the gap.    None  None   \n",
              "\n",
              "   descendants        author  \n",
              "0          NaN     sarath237  \n",
              "1          NaN  123123321321  \n",
              "2          0.0           ssn  \n",
              "3          0.0          hoju  \n",
              "4          0.0         kogir  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-04bf4669-5ac6-4766-8383-56bfd63e92c1\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>by</th>\n",
              "      <th>score</th>\n",
              "      <th>time</th>\n",
              "      <th>time_ts</th>\n",
              "      <th>title</th>\n",
              "      <th>url</th>\n",
              "      <th>text</th>\n",
              "      <th>deleted</th>\n",
              "      <th>dead</th>\n",
              "      <th>descendants</th>\n",
              "      <th>author</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>6940813</td>\n",
              "      <td>sarath237</td>\n",
              "      <td>0</td>\n",
              "      <td>1387536270</td>\n",
              "      <td>2013-12-20 10:44:30+00:00</td>\n",
              "      <td>Sheryl Brindo Hot Pics</td>\n",
              "      <td>http://www.youtube.com/watch?v=ym1cyxneB0Y</td>\n",
              "      <td>Sheryl Brindo Hot Pics</td>\n",
              "      <td>None</td>\n",
              "      <td>True</td>\n",
              "      <td>NaN</td>\n",
              "      <td>sarath237</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>6991401</td>\n",
              "      <td>123123321321</td>\n",
              "      <td>0</td>\n",
              "      <td>1388508751</td>\n",
              "      <td>2013-12-31 16:52:31+00:00</td>\n",
              "      <td>Are you people also put off by the culture of ...</td>\n",
              "      <td></td>\n",
              "      <td>They&amp;#x27;re pretty explicitly &amp;#x27;startup f...</td>\n",
              "      <td>None</td>\n",
              "      <td>True</td>\n",
              "      <td>NaN</td>\n",
              "      <td>123123321321</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>1531556</td>\n",
              "      <td>ssn</td>\n",
              "      <td>0</td>\n",
              "      <td>1279617234</td>\n",
              "      <td>2010-07-20 09:13:54+00:00</td>\n",
              "      <td>New UI for Google Image Search</td>\n",
              "      <td>http://googlesystem.blogspot.com/2010/07/googl...</td>\n",
              "      <td>Again following on Bing's lead.</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0.0</td>\n",
              "      <td>ssn</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>5012398</td>\n",
              "      <td>hoju</td>\n",
              "      <td>0</td>\n",
              "      <td>1357387877</td>\n",
              "      <td>2013-01-05 12:11:17+00:00</td>\n",
              "      <td>Historic website screenshots</td>\n",
              "      <td>http://webscraping.com/blog/Generate-website-s...</td>\n",
              "      <td>Python script to generate historic screenshots...</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0.0</td>\n",
              "      <td>hoju</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>7214182</td>\n",
              "      <td>kogir</td>\n",
              "      <td>0</td>\n",
              "      <td>1401561740</td>\n",
              "      <td>2014-05-31 18:42:20+00:00</td>\n",
              "      <td>Placeholder</td>\n",
              "      <td></td>\n",
              "      <td>Mind the gap.</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>0.0</td>\n",
              "      <td>kogir</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 45
        }
      ]
    },
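    {
      "cell_type": "markdown",
      "source": [
        "The query below combines the `stories` and `comments` tables to list every story posted on January 1, 2012, together with its number of comments. The common table expression `c` counts the rows in `comments` for each `parent` ID, and the main query matches those counts to stories with `s.id = c.parent`. We use a **LEFT JOIN** so that the results also include stories that didn't receive any comments; for those stories `num_comments` is NULL."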
      ],
      "metadata": {
        "id": "w-36FLhDPd2K"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Query to select all stories posted on January 1, 2012, with number of comments\n",
        "join_query = \"\"\"\n",
        "             -- Count the comments attached to each parent id\n",
        "             WITH c AS\n",
        "             (\n",
        "             SELECT parent, COUNT(*) AS num_comments\n",
        "             FROM `bigquery-public-data.hacker_news.comments`\n",
        "             GROUP BY parent\n",
        "             )\n",
        "             -- LEFT JOIN keeps every story, even those with no row in c\n",
        "             SELECT s.id AS story_id, s.by, s.title, c.num_comments\n",
        "             FROM `bigquery-public-data.hacker_news.stories` AS s\n",
        "             LEFT JOIN c\n",
        "             ON s.id = c.parent\n",
        "             WHERE EXTRACT(DATE FROM s.time_ts) = '2012-01-01'\n",
        "             ORDER BY c.num_comments DESC\n",
        "             \"\"\"\n",
        "\n",
        "# Run the query, and return a pandas DataFrame\n",
        "join_result = client.query(join_query).result().to_dataframe()\n",
        "join_result.head()"
      ],
      "metadata": {
        "id": "WU9bl9oYPlUj",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "f953b5e3-cb52-4565-9569-0726252c88f6"
      },
      "execution_count": 46,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   story_id           by                                              title  \\\n",
              "0   3412900  whoishiring              Ask HN: Who is Hiring? (January 2012)   \n",
              "1   3412901  whoishiring  Ask HN: Freelancer? Seeking freelancer? (Janua...   \n",
              "2   3412643     jemeshsu                                       Avoid Apress   \n",
              "3   3414012    ramanujam  Impress.js - a Prezi like implementation using...   \n",
              "4   3412891   Brajeshwar  There's no shame in code that is simply \"good ...   \n",
              "\n",
              "   num_comments  \n",
              "0         154.0  \n",
              "1          97.0  \n",
              "2          30.0  \n",
              "3          27.0  \n",
              "4          27.0  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-3b4a3ac7-629c-4141-b708-b9e276e85bc2\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>story_id</th>\n",
              "      <th>by</th>\n",
              "      <th>title</th>\n",
              "      <th>num_comments</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>3412900</td>\n",
              "      <td>whoishiring</td>\n",
              "      <td>Ask HN: Who is Hiring? (January 2012)</td>\n",
              "      <td>154.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>3412901</td>\n",
              "      <td>whoishiring</td>\n",
              "      <td>Ask HN: Freelancer? Seeking freelancer? (Janua...</td>\n",
              "      <td>97.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>3412643</td>\n",
              "      <td>jemeshsu</td>\n",
              "      <td>Avoid Apress</td>\n",
              "      <td>30.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>3414012</td>\n",
              "      <td>ramanujam</td>\n",
              "      <td>Impress.js - a Prezi like implementation using...</td>\n",
              "      <td>27.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>3412891</td>\n",
              "      <td>Brajeshwar</td>\n",
              "      <td>There's no shame in code that is simply \"good ...</td>\n",
              "      <td>27.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 46
        }
      ]
    },
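    {
      "cell_type": "markdown",
      "source": [
        "The results are sorted by `num_comments` in descending order, and BigQuery places NULL values last in a descending sort, so stories without comments appear at the end of the DataFrame. pandas shows these missing counts as **NaN** (\"not a number\"), which is also why the column is displayed with floating-point values such as `154.0`."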
      ],
      "metadata": {
        "id": "s3F7dqryPnua"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# None of these stories received any comments\n",
        "join_result.tail()"
      ],
      "metadata": {
        "id": "EbbScrGpPqTy",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "31d2e733-a524-4c1c-d96e-5fb5ff7adb43"
      },
      "execution_count": 47,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "     story_id               by  \\\n",
              "439   3412721           kooljp   \n",
              "440   3413606       willvarfar   \n",
              "441   3413159  see_cloudtweaks   \n",
              "442   3412972          abionic   \n",
              "443   3412388       deviceguru   \n",
              "\n",
              "                                                 title  num_comments  \n",
              "439  Carolina Panthers vs New Orleans Saints Live S...           NaN  \n",
              "440  Poll: what is your (Lipson-Shiu) personality t...           NaN  \n",
              "441      IBM Cloud Computing: Overview - United States           NaN  \n",
              "442  Is SPLUNK eating up your disk space, might be ...           NaN  \n",
              "443                      Google TV 2.0 screenshot tour           NaN  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-6dd5dd5e-c5fe-47d5-9be1-5117840e7b9d\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>story_id</th>\n",
              "      <th>by</th>\n",
              "      <th>title</th>\n",
              "      <th>num_comments</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>439</th>\n",
              "      <td>3412721</td>\n",
              "      <td>kooljp</td>\n",
              "      <td>Carolina Panthers vs New Orleans Saints Live S...</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>440</th>\n",
              "      <td>3413606</td>\n",
              "      <td>willvarfar</td>\n",
              "      <td>Poll: what is your (Lipson-Shiu) personality t...</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>441</th>\n",
              "      <td>3413159</td>\n",
              "      <td>see_cloudtweaks</td>\n",
              "      <td>IBM Cloud Computing: Overview - United States</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>442</th>\n",
              "      <td>3412972</td>\n",
              "      <td>abionic</td>\n",
              "      <td>Is SPLUNK eating up your disk space, might be ...</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>443</th>\n",
              "      <td>3412388</td>\n",
              "      <td>deviceguru</td>\n",
              "      <td>Google TV 2.0 screenshot tour</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-6dd5dd5e-c5fe-47d5-9be1-5117840e7b9d')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-6dd5dd5e-c5fe-47d5-9be1-5117840e7b9d button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-6dd5dd5e-c5fe-47d5-9be1-5117840e7b9d');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 47
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "As you've seen, JOINs horizontally combine results from different tables. If you instead would like to vertically concatenate columns, you can do so with a `UNION`. \n",
        "\n",
        "Next, we write a query to select all usernames corresponding to users who wrote stories or comments on January 1, 2014.  We use **UNION DISTINCT** (instead of **UNION ALL**) to ensure that each user appears in the table at most once."
      ],
      "metadata": {
        "id": "nZ65-rWPPsPf"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Query to select all users who posted stories or comments on January 1, 2014\n",
        "union_query = \"\"\"\n",
        "              SELECT c.by\n",
        "              FROM `bigquery-public-data.hacker_news.comments` AS c\n",
        "              WHERE EXTRACT(DATE FROM c.time_ts) = '2014-01-01'\n",
        "              UNION DISTINCT\n",
        "              SELECT s.by\n",
        "              FROM `bigquery-public-data.hacker_news.stories` AS s\n",
        "              WHERE EXTRACT(DATE FROM s.time_ts) = '2014-01-01'\n",
        "              \"\"\"\n",
        "\n",
        "# Run the query, and return a pandas DataFrame\n",
        "union_result = client.query(union_query).result().to_dataframe()\n",
        "union_result.head()"
      ],
      "metadata": {
        "id": "W7SueEUcPt5P",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "14eb4514-614e-4b88-b60e-f200412fe994"
      },
      "execution_count": 48,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         by\n",
              "0   kawsper\n",
              "1   mayrund\n",
              "2  webmaven\n",
              "3     kmfrk\n",
              "4    rbobby"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-f325571e-dcd6-497e-a91f-47c0de6d687f\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>by</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>kawsper</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>mayrund</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>webmaven</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>kmfrk</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>rbobby</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-f325571e-dcd6-497e-a91f-47c0de6d687f')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-f325571e-dcd6-497e-a91f-47c0de6d687f button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-f325571e-dcd6-497e-a91f-47c0de6d687f');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 48
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "To get the number of users who posted on January 1, 2014, we need only take the length of the DataFrame."
      ],
      "metadata": {
        "id": "pHWr22nyPwL4"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Number of users who posted stories or comments on January 1, 2014\n",
        "len(union_result)"
      ],
      "metadata": {
        "id": "fXlfo6uOPyWw",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "8bc18318-2b55-4e8d-d205-e5b4288df199"
      },
      "execution_count": 49,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "2282"
            ]
          },
          "metadata": {},
          "execution_count": 49
        }
      ]
    },
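    {
      "cell_type": "markdown",
      "source": [
        "To see the difference between **UNION ALL** and **UNION DISTINCT** without a BigQuery connection, here is a minimal sketch using Python's built-in `sqlite3` module. The `comments` and `stories` tables and their rows are made up for illustration. Note that SQLite spells the deduplicating form as plain `UNION`, whereas BigQuery requires the explicit `DISTINCT` keyword."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "import sqlite3\n",
        "\n",
        "# Hypothetical toy tables standing in for the Hacker News data\n",
        "conn = sqlite3.connect(\":memory:\")\n",
        "conn.executescript(\"\"\"\n",
        "    CREATE TABLE comments (author TEXT);\n",
        "    CREATE TABLE stories (author TEXT);\n",
        "    INSERT INTO comments VALUES ('alice'), ('bob'), ('alice');\n",
        "    INSERT INTO stories VALUES ('bob'), ('carol');\n",
        "\"\"\")\n",
        "\n",
        "# UNION ALL keeps every row from both SELECTs, duplicates included\n",
        "all_rows = conn.execute(\n",
        "    \"SELECT author FROM comments UNION ALL SELECT author FROM stories\"\n",
        ").fetchall()\n",
        "\n",
        "# UNION (SQLite's spelling of UNION DISTINCT) drops duplicate rows\n",
        "distinct_rows = conn.execute(\n",
        "    \"SELECT author FROM comments UNION SELECT author FROM stories\"\n",
        ").fetchall()\n",
        "\n",
        "# Compare the row counts of the two results\n",
        "print(len(all_rows), len(distinct_rows))"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },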
    {
      "cell_type": "markdown",
      "source": [
        "### Analytic Function\n",
        "\n",
        "You can also define analytic functions, which also operate on a set of rows like aggregation function. However, unlike aggregate functions, analytic functions return a (potentially different) value for each row in the original table. Analytic functions allow us to perform complex calculations with relatively straightforward syntax. For instance, we can quickly calculate moving averages and running totals, among other quantities.\n",
        "\n",
        "We'll work with the [San Francisco Open Data](https://www.kaggle.com/datasf/san-francisco) dataset."
      ],
      "metadata": {
        "id": "pWi_mgaHRr4d"
      }
    },
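    {
      "cell_type": "markdown",
      "source": [
        "As a small offline illustration of the idea, the sketch below contrasts an aggregate with analytic-style calculations in pandas. The daily trip counts are made-up numbers, not values from the dataset. Here `cumsum` plays the role of a running total and `rolling(...).mean()` that of a moving average."
      ],
      "metadata": {}
    },
    {
      "cell_type": "code",
      "source": [
        "import pandas as pd\n",
        "\n",
        "# Hypothetical daily trip counts (made-up numbers for illustration)\n",
        "trips = pd.DataFrame({\n",
        "    \"trip_date\": pd.to_datetime([\"2015-01-01\", \"2015-01-02\", \"2015-01-03\", \"2015-01-04\"]),\n",
        "    \"num_trips\": [10, 20, 30, 40],\n",
        "})\n",
        "\n",
        "# Aggregate: collapses all rows into a single value\n",
        "total_trips = trips[\"num_trips\"].sum()\n",
        "\n",
        "# Analytic-style: one value per row, computed over the rows up to the current one\n",
        "trips = trips.sort_values(\"trip_date\")\n",
        "trips[\"cumulative_trips\"] = trips[\"num_trips\"].cumsum()\n",
        "\n",
        "# Moving average over the current row and the one before it\n",
        "trips[\"moving_avg\"] = trips[\"num_trips\"].rolling(window=2, min_periods=1).mean()\n",
        "\n",
        "trips"
      ],
      "metadata": {},
      "execution_count": null,
      "outputs": []
    },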
    {
      "cell_type": "code",
      "source": [
        "# Construct a reference to the \"san_francisco\" dataset\n",
        "dataset_ref = client.dataset(\"san_francisco\", project=\"bigquery-public-data\")\n",
        "\n",
        "# API request - fetch the dataset\n",
        "dataset = client.get_dataset(dataset_ref)\n",
        "\n",
        "# Construct a reference to the \"bikeshare_trips\" table\n",
        "table_ref = dataset_ref.table(\"bikeshare_trips\")\n",
        "\n",
        "# API request - fetch the table\n",
        "table = client.get_table(table_ref)\n",
        "\n",
        "# Preview the first five lines of the table\n",
        "client.list_rows(table, max_results=5).to_dataframe()"
      ],
      "metadata": {
        "id": "toMZH_UbRvHH",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "8ed30442-ab7f-45cd-d630-f27bc1969a47"
      },
      "execution_count": 50,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   trip_id  duration_sec                start_date start_station_name  \\\n",
              "0   944732          2618 2015-09-24 17:22:00+00:00              Mezes   \n",
              "1   984595          5957 2015-10-25 18:12:00+00:00              Mezes   \n",
              "2   984596          5913 2015-10-25 18:13:00+00:00              Mezes   \n",
              "3  1129385          6079 2016-03-18 10:33:00+00:00              Mezes   \n",
              "4  1030383          5780 2015-12-06 10:52:00+00:00              Mezes   \n",
              "\n",
              "   start_station_id                  end_date end_station_name  \\\n",
              "0                83 2015-09-24 18:06:00+00:00            Mezes   \n",
              "1                83 2015-10-25 19:51:00+00:00            Mezes   \n",
              "2                83 2015-10-25 19:51:00+00:00            Mezes   \n",
              "3                83 2016-03-18 12:14:00+00:00            Mezes   \n",
              "4                83 2015-12-06 12:28:00+00:00            Mezes   \n",
              "\n",
              "   end_station_id  bike_number zip_code subscriber_type  \n",
              "0              83          653    94063        Customer  \n",
              "1              83           52      nil        Customer  \n",
              "2              83          121      nil        Customer  \n",
              "3              83          208    94070        Customer  \n",
              "4              83           44    94064        Customer  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-07082ed2-aa92-400a-a08d-e24b0d768d8f\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>trip_id</th>\n",
              "      <th>duration_sec</th>\n",
              "      <th>start_date</th>\n",
              "      <th>start_station_name</th>\n",
              "      <th>start_station_id</th>\n",
              "      <th>end_date</th>\n",
              "      <th>end_station_name</th>\n",
              "      <th>end_station_id</th>\n",
              "      <th>bike_number</th>\n",
              "      <th>zip_code</th>\n",
              "      <th>subscriber_type</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>944732</td>\n",
              "      <td>2618</td>\n",
              "      <td>2015-09-24 17:22:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>2015-09-24 18:06:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>653</td>\n",
              "      <td>94063</td>\n",
              "      <td>Customer</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>984595</td>\n",
              "      <td>5957</td>\n",
              "      <td>2015-10-25 18:12:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>2015-10-25 19:51:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>52</td>\n",
              "      <td>nil</td>\n",
              "      <td>Customer</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>984596</td>\n",
              "      <td>5913</td>\n",
              "      <td>2015-10-25 18:13:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>2015-10-25 19:51:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>121</td>\n",
              "      <td>nil</td>\n",
              "      <td>Customer</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>1129385</td>\n",
              "      <td>6079</td>\n",
              "      <td>2016-03-18 10:33:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>2016-03-18 12:14:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>208</td>\n",
              "      <td>94070</td>\n",
              "      <td>Customer</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>1030383</td>\n",
              "      <td>5780</td>\n",
              "      <td>2015-12-06 10:52:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>2015-12-06 12:28:00+00:00</td>\n",
              "      <td>Mezes</td>\n",
              "      <td>83</td>\n",
              "      <td>44</td>\n",
              "      <td>94064</td>\n",
              "      <td>Customer</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-07082ed2-aa92-400a-a08d-e24b0d768d8f')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-07082ed2-aa92-400a-a08d-e24b0d768d8f button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-07082ed2-aa92-400a-a08d-e24b0d768d8f');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 50
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "Each row of the table corresponds to a different bike trip, and we can use an analytic function to **calculate the cumulative number of trips for each date in 2015.**"
      ],
      "metadata": {
        "id": "nY8VBY-8Rx8w"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Query to count the (cumulative) number of trips per day\n",
        "num_trips_query = \"\"\"\n",
        "                  WITH trips_by_day AS\n",
        "                  (\n",
        "                  SELECT DATE(start_date) AS trip_date,\n",
        "                      COUNT(*) as num_trips\n",
        "                  FROM `bigquery-public-data.san_francisco.bikeshare_trips`\n",
        "                  WHERE EXTRACT(YEAR FROM start_date) = 2015\n",
        "                  GROUP BY trip_date\n",
        "                  )\n",
        "                  SELECT *,\n",
        "                      SUM(num_trips) \n",
        "                          OVER (\n",
        "                               ORDER BY trip_date\n",
        "                               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW\n",
        "                               ) AS cumulative_trips\n",
        "                      FROM trips_by_day\n",
        "                  \"\"\"\n",
        "\n",
        "# Run the query, and return a pandas DataFrame\n",
        "num_trips_result = client.query(num_trips_query).result().to_dataframe()\n",
        "num_trips_result.head()"
      ],
      "metadata": {
        "id": "_aHYi_0IR0WG",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "af0addb0-eb5f-4e0d-fe4b-49853affde91"
      },
      "execution_count": 51,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "    trip_date  num_trips  cumulative_trips\n",
              "0  2015-01-01        181               181\n",
              "1  2015-01-02        428               609\n",
              "2  2015-01-03        283               892\n",
              "3  2015-01-04        206              1098\n",
              "4  2015-01-05       1186              2284"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-6c452c63-a5a1-401b-804f-5cb9cb1e9b76\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>trip_date</th>\n",
              "      <th>num_trips</th>\n",
              "      <th>cumulative_trips</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>2015-01-01</td>\n",
              "      <td>181</td>\n",
              "      <td>181</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>2015-01-02</td>\n",
              "      <td>428</td>\n",
              "      <td>609</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>2015-01-03</td>\n",
              "      <td>283</td>\n",
              "      <td>892</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>2015-01-04</td>\n",
              "      <td>206</td>\n",
              "      <td>1098</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>2015-01-05</td>\n",
              "      <td>1186</td>\n",
              "      <td>2284</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-6c452c63-a5a1-401b-804f-5cb9cb1e9b76')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-6c452c63-a5a1-401b-804f-5cb9cb1e9b76 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-6c452c63-a5a1-401b-804f-5cb9cb1e9b76');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n\"2015-01-01\",\n{\n            'v': 181,\n            'f': \"181\",\n        },\n{\n            'v': 181,\n            'f': \"181\",\n        }],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n\"2015-01-02\",\n{\n            'v': 428,\n            'f': \"428\",\n        },\n{\n            'v': 609,\n            'f': \"609\",\n        }],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n\"2015-01-03\",\n{\n            'v': 283,\n            'f': \"283\",\n        },\n{\n            'v': 892,\n            'f': \"892\",\n        }],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n\"2015-01-04\",\n{\n            'v': 206,\n            'f': \"206\",\n        },\n{\n            'v': 1098,\n            'f': \"1098\",\n        }],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n\"2015-01-05\",\n{\n            'v': 1186,\n            'f': \"1186\",\n        },\n{\n            'v': 2284,\n            'f': \"2284\",\n        }]],\n        columns: [[\"number\", \"index\"], [\"string\", \"trip_date\"], [\"number\", \"num_trips\"], [\"number\", \"cumulative_trips\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n        rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 51
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "The query uses a [common table expression (CTE)](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#with_clause) to first calculate the daily number of trips.  Then, we use **SUM()** as an aggregate function.\n",
        "- Since there is no **PARTITION BY** clause, the entire table is treated as a single partition.\n",
        "- The **ORDER BY** clause orders the rows by date, where earlier dates appear first. \n",
        "- By setting the **window frame** clause to `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`, we ensure that all rows up to and including the current date are used to calculate the (cumulative) sum. See https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts#def_window_frame for more details.\n",
        "\n",
        "The next query **tracks the stations where each bike began (in `start_station_id`) and ended (in `end_station_id`) the day on October 25, 2015.**"
      ],
      "metadata": {
        "id": "ShH0iValR196"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "# Query to track beginning and ending stations on October 25, 2015, for each bike\n",
        "start_end_query = \"\"\"\n",
        "                  SELECT bike_number,\n",
        "                      TIME(start_date) AS trip_time,\n",
        "                      FIRST_VALUE(start_station_id)\n",
        "                          OVER (\n",
        "                               PARTITION BY bike_number\n",
        "                               ORDER BY start_date\n",
        "                               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING\n",
        "                               ) AS first_station_id,\n",
        "                      LAST_VALUE(end_station_id)\n",
        "                          OVER (\n",
        "                               PARTITION BY bike_number\n",
        "                               ORDER BY start_date\n",
        "                               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING\n",
        "                               ) AS last_station_id,\n",
        "                      start_station_id,\n",
        "                      end_station_id\n",
        "                  FROM `bigquery-public-data.san_francisco.bikeshare_trips`\n",
        "                  WHERE DATE(start_date) = '2015-10-25' \n",
        "                  \"\"\"\n",
        "\n",
        "# Run the query, and return a pandas DataFrame\n",
        "start_end_result = client.query(start_end_query).result().to_dataframe()\n",
        "start_end_result.head()"
      ],
      "metadata": {
        "id": "ivG9NWgZR7QY",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 197
        },
        "outputId": "9428ba76-003e-4e15-b2f4-589e5c23782f"
      },
      "execution_count": 52,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   bike_number trip_time  first_station_id  last_station_id  start_station_id  \\\n",
              "0           25  11:43:00                77               51                77   \n",
              "1           25  12:14:00                77               51                60   \n",
              "2          111  14:41:00                69               65                69   \n",
              "3          403  16:54:00                51               54                51   \n",
              "4          301  13:36:00                35               34                35   \n",
              "\n",
              "   end_station_id  \n",
              "0              60  \n",
              "1              51  \n",
              "2              65  \n",
              "3              54  \n",
              "4              35  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-72674f05-cc87-4aa7-91e0-ec5d6c148f23\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>bike_number</th>\n",
              "      <th>trip_time</th>\n",
              "      <th>first_station_id</th>\n",
              "      <th>last_station_id</th>\n",
              "      <th>start_station_id</th>\n",
              "      <th>end_station_id</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>25</td>\n",
              "      <td>11:43:00</td>\n",
              "      <td>77</td>\n",
              "      <td>51</td>\n",
              "      <td>77</td>\n",
              "      <td>60</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>25</td>\n",
              "      <td>12:14:00</td>\n",
              "      <td>77</td>\n",
              "      <td>51</td>\n",
              "      <td>60</td>\n",
              "      <td>51</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>111</td>\n",
              "      <td>14:41:00</td>\n",
              "      <td>69</td>\n",
              "      <td>65</td>\n",
              "      <td>69</td>\n",
              "      <td>65</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>403</td>\n",
              "      <td>16:54:00</td>\n",
              "      <td>51</td>\n",
              "      <td>54</td>\n",
              "      <td>51</td>\n",
              "      <td>54</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>301</td>\n",
              "      <td>13:36:00</td>\n",
              "      <td>35</td>\n",
              "      <td>34</td>\n",
              "      <td>35</td>\n",
              "      <td>35</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-72674f05-cc87-4aa7-91e0-ec5d6c148f23')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-72674f05-cc87-4aa7-91e0-ec5d6c148f23 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-72674f05-cc87-4aa7-91e0-ec5d6c148f23');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ],
            "application/vnd.google.colaboratory.module+javascript": "\n      import \"https://ssl.gstatic.com/colaboratory/data_table/f872b2c2305463fd/data_table.js\";\n\n      window.createDataTable({\n        data: [[{\n            'v': 0,\n            'f': \"0\",\n        },\n{\n            'v': 25,\n            'f': \"25\",\n        },\n\"11:43:00\",\n{\n            'v': 77,\n            'f': \"77\",\n        },\n{\n            'v': 51,\n            'f': \"51\",\n        },\n{\n            'v': 77,\n            'f': \"77\",\n        },\n{\n            'v': 60,\n            'f': \"60\",\n        }],\n [{\n            'v': 1,\n            'f': \"1\",\n        },\n{\n            'v': 25,\n            'f': \"25\",\n        },\n\"12:14:00\",\n{\n            'v': 77,\n            'f': \"77\",\n        },\n{\n            'v': 51,\n            'f': \"51\",\n        },\n{\n            'v': 60,\n            'f': \"60\",\n        },\n{\n            'v': 51,\n            'f': \"51\",\n        }],\n [{\n            'v': 2,\n            'f': \"2\",\n        },\n{\n            'v': 111,\n            'f': \"111\",\n        },\n\"14:41:00\",\n{\n            'v': 69,\n            'f': \"69\",\n        },\n{\n            'v': 65,\n            'f': \"65\",\n        },\n{\n            'v': 69,\n            'f': \"69\",\n        },\n{\n            'v': 65,\n            'f': \"65\",\n        }],\n [{\n            'v': 3,\n            'f': \"3\",\n        },\n{\n            'v': 403,\n            'f': \"403\",\n        },\n\"16:54:00\",\n{\n            'v': 51,\n            'f': \"51\",\n        },\n{\n            'v': 54,\n            'f': \"54\",\n        },\n{\n            'v': 51,\n            'f': \"51\",\n        },\n{\n            'v': 54,\n            'f': \"54\",\n        }],\n [{\n            'v': 4,\n            'f': \"4\",\n        },\n{\n            'v': 301,\n            'f': \"301\",\n        },\n\"13:36:00\",\n{\n            'v': 35,\n            'f': \"35\",\n        
},\n{\n            'v': 34,\n            'f': \"34\",\n        },\n{\n            'v': 35,\n            'f': \"35\",\n        },\n{\n            'v': 35,\n            'f': \"35\",\n        }]],\n        columns: [[\"number\", \"index\"], [\"number\", \"bike_number\"], [\"string\", \"trip_time\"], [\"number\", \"first_station_id\"], [\"number\", \"last_station_id\"], [\"number\", \"start_station_id\"], [\"number\", \"end_station_id\"]],\n        columnOptions: [{\"width\": \"1px\", \"className\": \"index_column\"}],\n        rowsPerPage: 25,\n        helpUrl: \"https://colab.research.google.com/notebooks/data_table.ipynb\",\n        suppressOutputScrolling: true,\n        minimumWidth: undefined,\n      });\n    "
          },
          "metadata": {},
          "execution_count": 52
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "The query uses both **FIRST_VALUE()** and **LAST_VALUE()** as analytic functions.\n",
        "- The **PARTITION BY** clause breaks the data into partitions based on the `bike_number` column.  Since this column holds unique identifiers for the bikes, this ensures the calculations are performed separately for each bike.\n",
        "- The **ORDER BY** clause puts the rows within each partition in chronological order.\n",
        "- Since the **window frame** clause is `ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING`, for each row, its entire partition is used to perform the calculation.  (_This ensures the calculated values for rows in the same partition are identical._)"
      ],
      "metadata": {
        "id": "iegKJa1nR_rZ"
      }
    },
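    {
      "cell_type": "markdown",
      "source": [
        "The partition-wide `FIRST_VALUE()` and `LAST_VALUE()` logic described above can also be sketched in pandas with `groupby(...).transform(...)`. This is a rough equivalent under stated assumptions: the `trips` frame is a hypothetical three-row sample modeled on the query output, and the column names `first_station_id` and `last_station_id` simply mirror the SQL aliases.\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical sample of trips on one day (not fetched from BigQuery)\n",
        "trips = pd.DataFrame({\n",
        "    'bike_number': [25, 25, 111],\n",
        "    'start_date': pd.to_datetime(['2015-10-25 12:14', '2015-10-25 11:43', '2015-10-25 14:41']),\n",
        "    'start_station_id': [60, 77, 69],\n",
        "    'end_station_id': [51, 60, 65],\n",
        "})\n",
        "\n",
        "# ORDER BY start_date within each bike\n",
        "trips = trips.sort_values(['bike_number', 'start_date']).reset_index(drop=True)\n",
        "\n",
        "# PARTITION BY bike_number; transform() broadcasts one value to every row of the group\n",
        "by_bike = trips.groupby('bike_number')\n",
        "trips['first_station_id'] = by_bike['start_station_id'].transform('first')\n",
        "trips['last_station_id'] = by_bike['end_station_id'].transform('last')\n",
        "print(trips)\n",
        "```\n",
        "\n",
        "As in the SQL version, every row belonging to the same bike receives the same pair of values."
      ],
      "metadata": {}
    },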
    {
      "cell_type": "markdown",
      "source": [
        "You can check https://cloud.google.com/bigquery/docs/reference/standard-sql/introduction and https://googleapis.dev/python/bigquery/latest/index.html for more details."
      ],
      "metadata": {
        "id": "4XUIvA2v2A30"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## Data Wrangling with Pandas"
      ],
      "metadata": {
        "id": "Sd-M02wrbNle"
      }
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LU-ma6F6bdkB"
      },
      "source": [
        "### `Series` objects\n",
        "The `pandas` library contains these useful data structures:\n",
        "* `Series` objects, that we will discuss now. A `Series` object is 1D array, similar to a column in a spreadsheet (with a column name and row labels).\n",
        "* `DataFrame` objects. This is a 2D table, similar to a spreadsheet (with column names and row labels)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "q6En2jWCbdkC"
      },
      "source": [
        "#### Creating a `Series`\n",
        "Let's start by creating our first `Series` object!"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "id": "Twbix6NpbdkC",
        "outputId": "5c5ec161-1d04-4701-d00f-099eacdfc3cb",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0    2\n",
              "1   -1\n",
              "2    3\n",
              "3    5\n",
              "dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 2
        }
      ],
      "source": [
        "s = pd.Series([2,-1,3,5])\n",
        "s"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6ERNyjvMbdkE"
      },
      "source": [
        "Arithmetic operations on `Series` are also possible, and they apply *elementwise*, just like for `ndarray`s:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "id": "qCPOAd6tbdkF",
        "outputId": "592e0e3f-0207-42e9-87f8-b62b22aea267",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0    1002\n",
              "1    1999\n",
              "2    3003\n",
              "3    4005\n",
              "dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 3
        }
      ],
      "source": [
        "s + [1000,2000,3000,4000]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "id": "WOQxfgJEbdkF",
        "outputId": "92e8e5e8-5e42-4b5f-dbe1-bace97c1a97a",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0    1002\n",
              "1     999\n",
              "2    1003\n",
              "3    1005\n",
              "dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 4
        }
      ],
      "source": [
        "s + 1000"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "05v47g-TbdkG"
      },
      "source": [
        "#### Index labels\n",
        "Each item in a `Series` object has a unique identifier called the *index label*. By default, it is simply the rank of the item in the `Series` (starting at `0`) but you can also set the index labels manually:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "id": "uToGC_H1bdkG",
        "outputId": "e7592d19-7aa9-4534-a111-b82f6a6c08a1",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "alice       68\n",
              "bob         83\n",
              "charles    112\n",
              "darwin      68\n",
              "dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 5
        }
      ],
      "source": [
        "s2 = pd.Series([68, 83, 112, 68], index=[\"alice\", \"bob\", \"charles\", \"darwin\"])\n",
        "s2"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YHll54B1bdkH"
      },
      "source": [
        "You can then use the `Series` just like a `dict`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "id": "k_5q4EuqbdkH",
        "outputId": "cbf7a319-2c09-4b4a-b011-8d658aa4900c",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "83"
            ]
          },
          "metadata": {},
          "execution_count": 6
        }
      ],
      "source": [
        "s2[\"bob\"]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mmMlzPbLbdkH"
      },
      "source": [
        "You can still access the items by integer location, like in a regular array:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {
        "id": "xhH_OoLQbdkI",
        "outputId": "6b629478-22e5-48e2-c7f5-6e85a1a40cd3",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "83"
            ]
          },
          "metadata": {},
          "execution_count": 7
        }
      ],
      "source": [
        "s2[1]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TFDcqmL2bdkI"
      },
      "source": [
        "To make it clear when you are accessing, it is recommended to always use the `loc` attribute when accessing by label, and the `iloc` attribute when accessing by integer location:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "metadata": {
        "id": "86J9jGtfbdkI",
        "outputId": "a26d3b9f-2b19-47aa-9bf8-8e4f13a48ab7",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "83"
            ]
          },
          "metadata": {},
          "execution_count": 8
        }
      ],
      "source": [
        "s2.loc[\"bob\"]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "metadata": {
        "id": "TMjzFBctbdkI",
        "outputId": "4e049de9-0899-4fef-987b-8677f4af8f50",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "83"
            ]
          },
          "metadata": {},
          "execution_count": 9
        }
      ],
      "source": [
        "s2.iloc[1]"
      ]
    },
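    {
      "cell_type": "markdown",
      "source": [
        "The distinction between `loc` and `iloc` matters most when the index labels are themselves integers, because a bare number could then mean either a label or a position. A minimal sketch, assuming a hypothetical `s_int` series that is not part of the original examples:\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Integer labels that do not match the positions\n",
        "s_int = pd.Series(['a', 'b', 'c'], index=[2, 1, 0])\n",
        "\n",
        "by_label = s_int.loc[0]      # the item whose label is 0\n",
        "by_position = s_int.iloc[0]  # the item stored first\n",
        "print(by_label, by_position)\n",
        "```\n",
        "\n",
        "Here `loc[0]` returns the last stored item while `iloc[0]` returns the first one, so spelling out the accessor removes the ambiguity."
      ],
      "metadata": {}
    },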
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tYYk0kN2bdkJ"
      },
      "source": [
        "Slicing a `Series` also slices the index labels:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "metadata": {
        "id": "pGJh6BLRbdkK",
        "outputId": "38fc2d8a-f54e-49b3-8636-13d44b1ec23c",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "bob         83\n",
              "charles    112\n",
              "dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 10
        }
      ],
      "source": [
        "s2.iloc[1:3]"
      ]
    },
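    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Because a slice keeps the original labels, position and label can disagree after slicing, which is one reason to prefer explicit `loc` and `iloc`. The sketch below uses a small hypothetical `Series` (the name `s` and its values are assumptions for illustration, not data from this lab) with the default integer labels:\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical data: a Series with the default integer labels 0..3.\n",
        "s = pd.Series([1000, 1001, 1002, 1003])\n",
        "\n",
        "# Slicing keeps the original labels, so the slice is labeled 2 and 3.\n",
        "s_slice = s.iloc[2:]\n",
        "\n",
        "first_by_position = s_slice.iloc[0]  # first element of the slice\n",
        "by_label = s_slice.loc[2]            # element whose label is 2\n",
        "print(first_by_position, by_label)\n",
        "\n",
        "# Label 0 is not part of the slice any more.\n",
        "print(0 in s_slice.index)\n",
        "```\n",
        "\n",
        "Both lookups return the same element here, but only because label `2` happens to sit at position `0` of the slice; `s_slice.loc[0]` would raise a `KeyError`."
      ]
    },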
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hzw9bWbBbdkM"
      },
      "source": [
        "#### Init from `dict`\n",
        "You can create a `Series` object from a `dict`. The keys will be used as index labels:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "id": "u-QkmNSSbdkM",
        "outputId": "2a88806a-f222-405e-8ffe-249fbfebe023",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "alice     68\n",
              "bob       83\n",
              "colin     86\n",
              "darwin    68\n",
              "dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 11
        }
      ],
      "source": [
        "weights = {\"alice\": 68, \"bob\": 83, \"colin\": 86, \"darwin\": 68}\n",
        "s3 = pd.Series(weights)\n",
        "s3"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "SM3qtn08bdkM"
      },
      "source": [
        "When an operation involves multiple `Series` objects, `pandas` automatically aligns items by matching index labels."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {
        "id": "5AI1dciMbdkM",
        "outputId": "132b6c95-3423-457a-cd78-2ecec6c6a120",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "Index(['alice', 'bob', 'charles', 'darwin'], dtype='object')\n",
            "Index(['alice', 'bob', 'colin', 'darwin'], dtype='object')\n"
          ]
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "alice      136.0\n",
              "bob        166.0\n",
              "charles      NaN\n",
              "colin        NaN\n",
              "darwin     136.0\n",
              "dtype: float64"
            ]
          },
          "metadata": {},
          "execution_count": 12
        }
      ],
      "source": [
        "print(s2.keys())\n",
        "print(s3.keys())\n",
        "\n",
        "s2 + s3"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6JMZev9LbdkN"
      },
      "source": [
        "The resulting `Series` contains the union of index labels from `s2` and `s3`. Since `\"colin\"` is missing from `s2` and `\"charles\"` is missing from `s3`, these items have a `NaN` result value. (ie. Not-a-Number means *missing*).\n",
        "\n",
        "Automatic alignment is very handy when working with data that may come from various sources with varying structure and missing items"
      ]
    },
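    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "If a label that is missing on one side should count as zero instead of producing `NaN`, the `add()` method accepts a `fill_value`. The following is a minimal sketch: the two `Series` are rebuilt to match the outputs shown above so that the example is self-contained, and the names `a`, `b` and `total` are placeholders chosen for illustration:\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Values rebuilt to match the outputs shown earlier in this notebook.\n",
        "a = pd.Series([68, 83, 112, 68], index=[\"alice\", \"bob\", \"charles\", \"darwin\"])\n",
        "b = pd.Series({\"alice\": 68, \"bob\": 83, \"colin\": 86, \"darwin\": 68})\n",
        "\n",
        "# Labels present in only one Series are treated as 0 on the other side.\n",
        "total = a.add(b, fill_value=0)\n",
        "print(total)\n",
        "```\n",
        "\n",
        "Whether zero is a sensible default depends on the data; for weights it probably is not, so treat this as a demonstration of the mechanism rather than a recommendation."
      ]
    },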
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "H57kzHKabdkN"
      },
      "source": [
        "#### Init with a scalar\n",
        "You can also initialize a `Series` object using a scalar and a list of index labels: all items will be set to the scalar."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {
        "id": "fz5q8tyGbdkN",
        "outputId": "a50c358f-3049-464a-bd4d-587fad2b067b",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "life          42\n",
              "universe      42\n",
              "everything    42\n",
              "dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 13
        }
      ],
      "source": [
        "meaning = pd.Series(42, [\"life\", \"universe\", \"everything\"])\n",
        "meaning"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lQUC1nqObdkO"
      },
      "source": [
        "Pandas makes it easy to plot `Series` data using matplotlib (for more details on matplotlib. Just import matplotlib and call the `plot()` method:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {
        "scrolled": true,
        "id": "IQFkp_wlbdkO",
        "outputId": "e4cc9a2f-5028-43ac-ead3-e9d8e031bb7f",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 265
        }
      },
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ],
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD4CAYAAAD8Zh1EAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3dd3jV5f3/8ec7e0IgCTvMMGSPyBCk4h4o1l131S+14modX7W7tdVWq9b9U+rexIG1ihMHDjRhb8IOM2GEJJB9//7g6JdSRkhOcuecvB7XlStnfMx5HeF65cN97s99m3MOEREJfRG+A4iISHCo0EVEwoQKXUQkTKjQRUTChApdRCRMRPl64bS0NNe1a1dfLy8iEpJyc3MLnXPp+3vOW6F37dqVnJwcXy8vIhKSzGzNgZ7TkIuISJhQoYuIhAkVuohImFChi4iECRW6iEiYUKGLiIQJFbqISJjwNg9d/Cotr+LbVdtYtrmYgZ1SGNolhdioSN+xRKQeVOjNRGV1DXPX7WBGXiFf5W1l1trtVNX831r4cdERDO+WypjMVI7qkUbf9i2IiDCPiUXkcKnQw5RzjmWbS5iRV8iXeYXMXLmV0opqzGBAx5b8z9jujMlMo3e7ZOas3fHDcX95dwkArRKiOSozjTGZaYzukUbn1ATP70hEDkWFHkbW79jNl4Fi/jJvK4Ul5QB0S0vkx0M7MiYzjZHdU0lJiPmP/+74vm05vm9bADbvLOOrFYXMWL6VL/MK+fe8jQBktI5nTGYaR/VI46geqaQmxTbumxORQzJfW9BlZWU5reVSP0W7Kvl6ZWHg7HorqwpLAUhLimF0ZtoPXx1T4uv0851zrCgoDRR8IV+v3EpxWRUAfdu3YEzPPeU+vFtrEmJ0biDSGMws1zmXtd/nVOiho6yymtw1238YHpm/vgjnIDEmkhHdUxkdGCLp1TYJs+CPf1dV1zB/fRFfrdjKjOWF5K7ZTkV1DdGRxtDOrX74BTKoU0uiIjWBSqQhqNBDVHWNY+GGoh8KPGf1dsqraoiKMIZ0TvmhwAdlpBDtoUB3V1STs2bbD/kWbtiJc5AcGxX4BZPKmMw0Mts0zC8YkeboYIWufyc3MRuLdvPR4i18GRjiKNpdCUCfdslcPLILYzLTOLJba5Ji/f/RxcdEcnTPdI7uuWdp5u2lFXy9cusPBf/R4s0AtEmO3TP+nplGn3bJRDRSuUdGGD3bJGm2jjQbOkNvQop2V3L0Xz9hZ1kVHVPiGZ25ZxjlqB5ppCeH3oeQ67bt2jP+nreVr/IK2Vpa0egZjumdzhOXZBETpSEgCQ86Qw8R78zbwM6yKl64cgSjM1NDfpgio3UC57fuzPlHdqamxrFkUzFrt+1qtNdfvrmYv3+4jBtfnc2DFwzRuL6EPRV6EzIlJ5/ebZPDosz3FRFh9O3Qgr4dWjTaa57cvx3xMZHc+e/FJMTM529nD9Twi4Q1FXoTkbelmDnrdvCrU48IuzL36aqju1NSXsUDHy0nKTaK353eV/9/JWyp0JuIKbn5REYYZw7p6DtK2LnhuJ4Ul1XxzxmrSI6L4qYTe/uOJNIgajWoaGYpZpZtZkvMbLGZjdrneTOzB80sz8zmmdnQhokbnqqqa3hj1nrG9W4Tkh9+NnVmxq9PO4ILjszgoU/yePyzFb4jiTSI2p6h/wOY5pw7x8xigH0X9jgF6Bn4GgE8FvgutfD58gIKiss5N6uT7yhhy8z4848HUFJexd3vLSEpNoqLR3bxHUskqA5Z6GbWEhgLXA7gnKsA9p1/NgF4zu2ZA/lN4Iy+vXNuY5DzhqXs3HxaJ8Ywrncb31HCWmSEcf/5g9ldUc1vpi4gKTZKQ1wSVmoz5NINKACeNrPZZjbZzBL3OaYjsG6v+/mBx/6DmU00sxwzyykoKKhz6HCyvbSCjxZt4czBHTVXuhFER0bwyEVDGdktlZ
umzOWDhZt8RxIJmto0SBQwFHjMOTcEKAVuq8uLOeeecM5lOeey0tPT6/Ijws7UOeupqK7RcEsjiouO5MnLshjQsSXXvjSbGcsLfUcSCYraFHo+kO+cmxm4n82egt/beiBjr/udAo/JIUzJzad/xxYc0b7x5mcLJMVG8cxPj6R7eiL/81wOuWu2+Y4kUm+HLHTn3CZgnZl9P9frOGDRPoe9DVwamO0yEijS+PmhLdqwk4UbdnLOUJ2d+5CSEMPzV46gXcs4Ln/6OxasL/IdSaReajtoex3wopnNAwYDfzGzq83s6sDz7wIrgTzgSeCaoCcNQ9m5+cRERjBhsD6Y8yU9OZYXrhpBcmwUlz71LXlbin1HEqkzLc7lSUVVDSPv+piR3Vvz6EXDfMdp9lYVlnLu418TFWFMuXoUGa215Z40TQdbnEvTKjz5ZMkWtpVWcO6wjEMfLA2uW1oiz185nN2V1Vw0eSabd5b5jiRy2FTonmTn5tMmOZaje6b5jiIBR7RvwTM/PZKtJeVcPHkm2zws9ytSHyp0DwqKy5m+dAs/HtpRS7o2MUM6t2LyZUeydtsuLnvqW3aWVfqOJFJrahMP3pq9nuoap+GWJmpUj1Qeu3goizfu5KpncthdUe07kkitqNAbmXOOKbnrGNI5hcw2Sb7jyAEc26ctD1wwmJw12/jZC7mUV6nUpelToTeyeflFLNtcwjnDNPe8qRs/sAN3nzWQz5cVcOMrc6iqrvEdSeSgVOiNLDs3n9ioCE4f1MF3FKmF847M4Dfj+/Legk387+vzqanxM81XpDa0wUUjKqusZuqc9Zzcvx0t4qJ9x5FaunJMN0rKqrj/o2UkxUby+zP6adcjaZJU6I3ow0Wb2VlWpQ9DQ9D1x2VSUl7Jk1+sIikuiltO6uM7ksh/UaE3oim5+XRoGceoHqm+o8hhMjPuOPUISsqreGT6ChJjo7jmmEzfsUT+gwq9kWwqKmPG8gImjcskUjvPhyQz484zB1BaXs3fpi0lOTaKS0Z19R1L5Acq9Eby+qx8ahya3RLiIiOMv583iF0VVfxm6kISYqI4W3+m0kRolksjcM6RnZvP8G6t6ZK672ZPEmqiIyN4+MKhHNUjlVuy5zJtgXY9kqZBhd4IctdsZ1Vhqc7Ow0hcdCRPXprFoIwUrn95Np8v05aK4p8KvRFk5+aTEBPJaQPa+44iQZQYG8Uzlw+nR5skJj6fw3erteuR+KVCb2C7Kqp4Z95GTh3QnsRYfWQRblomRPPcFcPp0DKeK57+jrwtJb4jSTOmQm9g0xZsoqS8inM13BK20pNjef6qEcRERTDpxVlazEu8UaE3sCk5+XRuncDwbq19R5EG1DElnvvPH8yyLcX8duoC33GkmVKhN6B123bx9cqtnDOsky4VbwbG9krnunGZTMnNZ0rOOt9xpBlSoTeg12flY4bmKTcjNxzfi1HdU/nN1AUs3aQNp6VxqdAbSE3Nnrnno3uk0TEl3nccaSSREcY/fjKYpNhofv5iLqXlVb4jSTOiQm8g36zaSv723Zp73gy1SY7jwZ8MZnVhKXe8OR/ntOSuNA4VegPJzs0nOTaKk/q18x1FPDiqRxq/OL4XU+ds4OVvNZ4ujUOF3gBKyqt4b/4mxg/qQHxMpO844smkcZmM7ZXO7/+1kAXri3zHkWZAhd4A/j1vA7srqzk3S8MtzVlEhHH/eYNonRDDtS/NYmdZpe9IEuZU6A1gSk4+3dMTGZKR4juKeJaaFMtDFw5h3fbd3Pb6PI2nS4NSoQfZqsJSctZs59xhGZp7LgAc2bU1t5zUm3fnb+K5r9f4jiNhrFaFbmarzWy+mc0xs5z9PH+MmRUFnp9jZr8NftTQkJ27jgiDs4Z29B1FmpCJR3fnuD5tuPPfi5i7bofvOBKmDucMfZxzbrBzLusAz38ReH6wc+6PwQgXaqprHK/nrudHvdJp2yLOdxxpQiICG2O0SY5j0kuzKNql8XQJPg25BNGMvE
I27SzjHG0CLfuRkhDDwxcOYfPOMm7OnqvxdAm62ha6Az4ws1wzm3iAY0aZ2Vwze8/M+u3vADObaGY5ZpZTUBB+GwJk5+aTkhDN8X3b+I4iTdSQzq247ZQj+HDRZv45Y5XvOBJmalvoY5xzQ4FTgElmNnaf52cBXZxzg4CHgLf290Occ08457Kcc1np6el1Dt0UFe2q5P2Fm5gwqAOxUZp7Lgd2xeiunNSvLXe/t4TcNdt9x5EwUqtCd86tD3zfArwJDN/n+Z3OuZLA7XeBaDNLC3LWJu3teRuoqKrRcIsckpnxt3MG0T4ljmtfmsW20grfkSRMHLLQzSzRzJK/vw2cCCzY55h2FpijZ2bDAz93a/DjNl3ZOevo0y6Z/h1b+I4iIaBlfDSPXjiMrSUV/PK1OdTUaDxd6q82Z+htgRlmNhf4Fvi3c26amV1tZlcHjjkHWBA45kHgAteMPvFZtrmYuflFWvdcDsuATi35zfgj+HRpAY9/vsJ3HAkDh9zk0jm3Ehi0n8cf3+v2w8DDwY0WOrJz84mKMM4cornncnguHtmFmau2ce/7SxnWuRUjuqf6jiQhTNMW66myuoY3Zq1nXJ82pCXF+o4jIcbMuOusAXRJTeS6l2dTWFLuO5KEMBV6PX22tIDCknJtAi11lhwXzSMXDqVodyU3vjKHao2nSx2p0OspOzeftKQYxvXR3HOpu74dWvCHM/oxI6+Qhz/J8x1HQpQKvR62lVbw8ZLNnDm4I9GR+l8p9XP+kRmcNaQjD3y8jC/zCn3HkRCkFqqHt2avp7LacY7WPZcgMDPu/HF/eqQnccMrs9mys8x3JAkxKvR6mJKbz4COLenTTnPPJTgSYqJ47KKhlJZXc93Ls6mqrvEdSUKICr2OFm4oYvHGndqVSIKuZ9tk7jyzPzNXbeP+j5b5jiMhRIVeR1Ny8omJjOCMQR18R5EwdPawTpyflcEj01cwfekW33EkRKjQ66Ciqoapc9ZzQt+2pCTE+I4jYeoPE/rRp10yv3x1Dht27PYdR0KACr0OPl68me27KvVhqDSouOhIHr1oKBVVNVz38mwqNZ4uh6BCr4Ps3HzatohlbM/wWgJYmp7u6UncffZActds5573l/qOI02cCv0wbSku49NlBZw1tBOREVqISxre6YM6cPHIzjzx+Uo+XLTZdxxpwlToh+nNWeuprnGco0v9pRH9+rS+9O/Ygptem8O6bbt8x5EmSoV+GJxzTMnNZ2jnFHqkJ/mOI81IXHQkj1w4FOfg2pdmUVGl8XT5byr0wzA3v4i8LSWcm6VdiaTxdUlN5J5zBzI3v4i/vLvYdxxpglToh2FKzjrioiM4bWB731GkmTq5f3t+Ororz3y1mvfmb/QdR5oYFXotlVVW8/bcDZzcrx0t4qJ9x5Fm7PZTjmBwRgq3Zs9jdWGp7zjShKjQa+n9hZsoLqvScIt4FxMVwcMXDiEiwpj00izKKqt9R5ImQoVeS9m5+XRMiWeUtgiTJqBTqwTuO28QCzfs5E/vLPIdR5oIFXotbNixmxl5hZw9rBMRmnsuTcRxR7TlZ2O78+LMtUyds953HGkCVOi18MasfJyDc4Zq7rk0LTef1JusLq244435rCgo8R1HPFOhH4JzjuzcfEZ0a03n1ATfcUT+Q3RkBA9dOITY6EgmvTiL3RUaT2/OVOiH8N3q7azeuksfhkqT1b5lPPedN4ilm4v53dsLfMcRj1Toh/Dyt2tJjInk1AHtfEcROaBjerdh0jGZvJaTT3Zuvu844okK/SCWbipm6pz1/GR4ZxJionzHETmoG4/vycjurfn1W/NZtrnYdxzxQIV+EH+btoTE2Cgmjcv0HUXkkKIiI3jwgiEkxUZzzYuzKC2v8h1JGpkK/QBmrtzKx0u2cM0xmbRK1K5EEhratIjjwQsGs6KghF+/tQDnnO9I0ohU6PvhnOPuaUto1yKOn47u6juOyGE5KjONG4/rxZuz1/Pqd+t8x5FGVKtCN7
PVZjbfzOaYWc5+njcze9DM8sxsnpkNDX7UxvP+wk3MXruDX5zQk7joSN9xRA7btcdmcnTPNH779kIWbdjpO440ksM5Qx/nnBvsnMvaz3OnAD0DXxOBx4IRzoeq6hr+Nm0pmW2SOFsXEkmIioww7j9/MK0Sopn00iyKyyp9R5JGEKwhlwnAc26Pb4AUMwvJNWZfy8lnZWEpt57Um6hIjUhJ6EpLiuXBC4awdtsubntjvsbTm4HaNpYDPjCzXDObuJ/nOwJ7D9blBx77D2Y20cxyzCynoKDg8NM2sF0VVTzw0TKyurTihL5tfccRqbcR3VO56cRe/HveRl74Zo3vONLAalvoY5xzQ9kztDLJzMbW5cWcc08457Kcc1np6el1+REN6ukvV7OluJzbTumDmRbhkvBw9dgejOudzp/eWcz8/CLfcaQB1arQnXPrA9+3AG8Cw/c5ZD2w97XxnQKPhYxtpRU8/ukKTujblqyurX3HEQmaiAjjvvMGk5YUwzUv5VK0W+Pp4eqQhW5miWaW/P1t4ERg3wUj3gYuDcx2GQkUOedCan+shz/Jo7SiiltP6u07ikjQtUqM4aELh7JxRxm3Zs/VeHqYqs0ZeltghpnNBb4F/u2cm2ZmV5vZ1YFj3gVWAnnAk8A1DZK2gazbtovnv1nNucMy6Nk22XcckQYxrEsrbjulD+8v3MxTX672HUcawCEXKHHOrQQG7efxx/e67YBJwY3WeO77cBkRZvzihF6+o4g0qCvHdGPmqm3c9e5ihnROYWjnVr4jSRA1+3l5CzcU8dac9VwxphvtWsb5jiPSoMyMe88ZRLuWcVz30mx27KrwHUmCqNkX+l+nLaVFXDRX/6iH7ygijaJlQjSPXjSUguJybnptLjU1Gk8PF8260L/MK+TzZQVcOy6TlvHRvuOINJqBnVL41WlH8PGSLTzxxUrfcSRImm2h19Q47n5vCR1T4rlkVBffcUQa3aWjunDagPbc8/5Svlu9zXccCYJmW+jvLtjI/PVF/PKEXlqAS5olM+PusweQ0Sqea1+axdaSct+RpJ6aZaFXVtdwz/tL6dMumTOH/NcKBSLNRnJcNI9cNJTtuyq58dU5Gk8Pcc2y0F/+di1rtu7if0/uQ2SELvGX5q1fh5b8/vR+fLG8kEem5/mOI/XQ7Aq9pLyKBz9ezohurTmmd9NbT0bEh58Mz+DMwR24/6NlfLWi0HccqaNmV+iTv1hJYUmFFuAS2YuZ8ecfD6BbWiLXvzyHLcVlviNJHTSrQi8oLufJz1dy6oB2DNEVciL/ITE2ikcvGkZJeSU3vDyHao2nh5xmVegPf7Kcsqoabj5RC3CJ7E/vdsn8aUJ/vl65lX98tMx3HDlMzabQVxeW8uLMtVxwZAbd05N8xxFpss7NyuDcYZ14aHoeny9rehvRyIE1m0K/94OlREdGcMNxPX1HEWny/jihP73aJHPjq3PYVKTx9FDRLAp9Xv4O3pm3kauO7kabFlqAS+RQ4mMieeSioZRVVnPdy7Ooqq7xHUlqIewL3bk9l/i3Toxh4tjuvuOIhIzMNkncddYAvlu9nXs/0Hh6KAj7Qv98eSFfrdjKdcdmkhynBbhEDseEwR25cERnHv9sBdOXbvEdRw4hrAv9+wW4MlrHc+GIzr7jiISk347vS++2ydzxxnxKyqt8x5GDCOtCf3vuBhZv3MnNJ/YmNkoLcInURVx0JHedPYBNO8v4+wdLfceRgwjbQi+vqubeD5bSr0MLTh/YwXcckZA2tHMrLhnZhWe+Ws3cdTt8x5EDCNtCf/GbteRv381tp/QhQgtwidTbLSf1pk1yLLe/MV+zXpqosCz0nWWVPPTJcsZkpnF0Ty3AJRIMyXHR/OGMfizauJOnvlzlO47sR1gW+hOfrWT7rkr+9+Q+vqOIhJWT+rXj+CPacv+Hy1m3bZfvOLKPsCv0LTvLmDxjJacP6sCATi19xxEJK2bGHyf0I8Lg128twDkt4NWUhF2hP/
DxcqprHDef2Mt3FJGw1CElnptP6s1nywr417yNvuPIXsKq0FcUlPDqd+u4aEQXuqQm+o4jErYuHdWVQZ1a8sd/LaRoV6XvOBIQVoV+z7SlxEVFcO2xmb6jiIS1yAjjL2cNYPuuSu56b7HvOBIQNoU+a+12pi3cxMSxPUhLivUdRyTs9evQkivHdOOV79bx7aptvuMIYVLozjnufncJaUkxXHV0N99xRJqNG4/vSadW8dz+xjzKq6p9x2n2al3oZhZpZrPN7J39PHe5mRWY2ZzA11XBjXlw05du4dvV27jhuJ4kxkY15kuLNGsJMVHceWZ/VhSU8vinK33HafYO5wz9BuBgg2WvOucGB74m1zNXrVXXOP763lK6piZwwXAtwCXS2I7p3YbTB3Xgkel55G0p8R2nWatVoZtZJ+A0oNGKurbemJXP0s3F3HJSH6Ijw2IESSTk/HZ8X+KiI/jVm/M1N92j2jbgA8CtwMEWcDjbzOaZWbaZZezvADObaGY5ZpZTUFD/vQrLKqu578NlDOrUklMHtKv3zxORuklPjuWOU49g5qptTMnJ9x2n2TpkoZvZeGCLcy73IIf9C+jqnBsIfAg8u7+DnHNPOOeynHNZ6en1X2Plua9Xs7GojP89pQ9mWoBLxKfzsjIY3rU1f353MYUl5b7jNEu1OUMfDZxhZquBV4BjzeyFvQ9wzm11zn3/JzgZGBbUlPtRtKuSR6av4Ee90jmqR1pDv5yIHEJEhPGXs/qzq6KKP72zyHecZumQhe6cu90518k51xW4APjEOXfx3seYWfu97p7BwT88DYpHP8tjZ5kW4BJpSjLbJPPzYzKZOmcDny2r/7CqHJ46f4poZn80szMCd683s4VmNhe4Hrg8GOEOZMOO3Tz95Wp+PLgjfTu0aMiXEpHDdM0xPeiensiv35rP7grNTW9Mh1XozrlPnXPjA7d/65x7O3D7dudcP+fcIOfcOOfckoYI+70HPloGDn5xghbgEmlq4qIj+cuPB7Bu224e+HiZ7zjNSsjN81u2uZjs3HwuGdWFjNYJvuOIyH6M7J7KeVmdmPzFKhZt2Ok7TrMRcoW+ZWc5mW2SmDROC3CJNGV3nHoEKfHR3P7mfKprNDe9MYRcoY/pmcb7N46ldWKM7ygichApCTH89vS+zF23g+e/Xu07TrMQcoUOaM65SIg4Y1AHju6Zxj3vL2Vj0W7fccJeSBa6iIQGM+PPZw6g2jl+N3Wh7zhhT4UuIg2qc2oCNxzXiw8WbWbagk2+44Q1FbqINLirju5Gn3bJ/P7thRSXacu6hqJCF5EGFx0Zwd1nD2RzcRn3vr/Ud5ywpUIXkUYxOCOFS0d24blv1jB77XbfccKSCl1EGs3NJ/WmbXIct78xn8rqg63GLXWhQheRRpMcF80fJvRjyaZiJn+xynecsKNCF5FGdVK/dpzYty3/+HgZa7fu8h0nrKjQRaTR/WFCP6IiIvjVW9qyLphU6CLS6Nq3jOfmE3vxxfJCps7Z4DtO2FChi4gXl4zqyqCMFP70ziJ27KrwHScsqNBFxIvICOPuswawY3clf3m3wTc5axZU6CLizRHtW3DV0d14LSefr1ds9R0n5KnQRcSrG4/rRUbreH715nzKKrVlXX2o0EXEq/iYSO48cwArC0t59NMVvuOENBW6iHj3o17pTBjcgcc+zSNvS7HvOCFLhS4iTcJvxvclISaK29+YT422rKsTFbqINAlpSbHccWofvlu9nVdz1vmOE5JU6CLSZJyXlcGIbq25693FfLhos64iPUwqdBFpMsyMv549kNSkWP7nuRzOfuwrvlmp6Yy1pUIXkSala1oiH/xiLHedNYD1O3ZzwRPfcOlT37JgfZHvaE2e+fonTVZWlsvJyfHy2iISGsoqq3n2q9U8+ukKinZXMn5ge246sTfd0hJ9R/PGzHKdc1n7fU6FLiJNXdHuSp78fCX/nLGKiuoazsvqxPXH9aR9y3jf0RrdwQq91kMuZh
ZpZrPN7J39PBdrZq+aWZ6ZzTSzrnWPKyLyn1rGR3PzSb35/NZxXDKyC9m5+Rxzz6fc9e5itpdqYa/vHc4Y+g3AgVbQuRLY7pzLBO4H/lrfYCIi+0pPjuX3Z/Tjk5uO4bQB7Xnii5WM/dt0Hvp4OaXlVb7jeVerQjezTsBpwOQDHDIBeDZwOxs4zsys/vFERP5bRusE7jt/MNNuGMvIHqn8/cNl/Oie6Tzz5SrKq5rvejC1PUN/ALgVONCurh2BdQDOuSqgCEjd9yAzm2hmOWaWU1BQUIe4IiL/p3e7ZJ68NIvXf34UPdKT+P2/FnHc3z/j9dx8qpvh1aaHLHQzGw9scc7l1vfFnHNPOOeynHNZ6enp9f1xIiIADOvSilcmjuTZK4bTMj6am6bM5ZR/fM4HCzc1q4uTanOGPho4w8xWA68Ax5rZC/scsx7IADCzKKAloKsBRKTRmBk/6pXOv64dw8MXDqGy2jHx+VzOeuyrZrPW+iEL3Tl3u3Ouk3OuK3AB8Ilz7uJ9DnsbuCxw+5zAMc3n16KINBkREcb4gR1+uDhp444yfvJk87g4qc5XiprZH83sjMDdfwKpZpYH/BK4LRjhRETqKjoygp8M78yntxzDHaf2YV7+DsY/NINJL81iZUGJ73gNQhcWiUizsLPs/y5OKq8K3YuTdKWoiEhAQXE5j0zP48WZazAzLj+qKz//UQ9aJcb4jlYrQblSVEQkHOx9cdL4ge158ouVnPKPL1i3bZfvaPWmQheRZimjdQL3nTeYqZNGs7uymosmz2TzzjLfsepFhS4izdrATik8e8VwtpaUc/HkmWwL4bVhVOgi0uwNzkhh8mVHsnbbLi576lt2llX6jlQnKnQREWBUj1Qeu3goizfu5MpnvmN3ReitCaNCFxEJOLZPWx64YDC5a7Yz8fmckFvoS4UuIrKX8QM7cPdZA/lieSE3vDyHquoDrUnY9KjQRUT2cd6RGfxmfF+mLdzEra/PoyZEVm6M8h1ARKQpunJMN0rLq7jvw2UkxUbxhzP60dS3eVChi4gcwHXHZlJSXsUTn68kOS6KW07q4zvSQanQRUQOwMy4/ZQ+FJdV8cj0FSTGRnHNMZm+Yx2QCl1E5CDMjDvP7M+uiir+Nm0pybFRXB6yN4UAAAaDSURBVDKqq+9Y+6VCFxE5hMgI495zB1FaXs1vpi4kISaKs4d18h3rv2iWi4hILURHRvDwhUM4qkcqt2TPZdqCjb4j/RcVuohILcVFR/LkpVkMykjhupdn89myprXZvQpdROQwJMZG8czlw8lsk8zPns/hu9XbfEf6gQpdROQwtUyI5vkrh9MhJZ4rnv6O+flNY69SFbqISB2kJcXywpUjaBEfzaVPzWT55mLfkVToIiJ11SElnhevGkFUZAQXTZ7J2q1+dz1SoYuI1EPXtEReuHIEFdU1XDj5GzYV+dv1SIUuIlJPvdsl8+xPh7NjVyUXTf6GrSXlXnKo0EVEgmBQRgqTL8sif/tuLvW065EKXUQkSEZ2T+XxS4axbHMxVzz9Hbsqqhr19VXoIiJBNK53Gx44fwiz1m7nZ8/nNuquRyp0EZEgO21ge+4+e8+uR9e9NLvRdj1SoYuINIDzsjL43el9+WDRZm7Jbpxdj7TaoohIA/np6D27Ht37wTISYyP504T+Dbrr0SEL3czigM+B2MDx2c653+1zzOXAPcD6wEMPO+cmBzeqiEjomTQuk+LyKv7fZytJjI3itpP7NFip1+YMvRw41jlXYmbRwAwze885980+x73qnLs2+BFFREKXmXHbyX0oKdtT6i3iopk0rmF2PTpkoTvnHFASuBsd+AqNLbBFRJoAM+NPE/pTWl7FPe8vJTEmkstHdwv669TqQ1EzizSzOcAW4EPn3Mz9HHa2mc0zs2wzyzjAz5loZjlmllNQ0LTWERYRaUgREcY95w7i9EEd6Jya0CCvYXtOwGt5sFkK8CZwnXNuwV6PpwIlzrlyM/sZcL5z7t
iD/aysrCyXk5NTx9giIs2TmeU657L299xhTVt0zu0ApgMn7/P4Vufc94sXTAaG1SWoiIjU3SEL3czSA2fmmFk8cAKwZJ9j2u919wxgcTBDiojIodVmlkt74Fkzi2TPL4DXnHPvmNkfgRzn3NvA9WZ2BlAFbAMub6jAIiKyf4c1hh5MGkMXETl8QRtDFxGRpkuFLiISJlToIiJhQoUuIhImvH0oamYFwJo6/udpQGEQ4zQ14fz+9N5CVzi/v1B6b12cc+n7e8JbodeHmeUc6FPecBDO70/vLXSF8/sLl/emIRcRkTChQhcRCROhWuhP+A7QwML5/em9ha5wfn9h8d5CcgxdRET+W6ieoYuIyD5U6CIiYSLkCt3MTjazpWaWZ2a3+c4TLGaWYWbTzWyRmS00sxt8Zwq2wM5Xs83sHd9Zgs3MUgK7dS0xs8VmNsp3pmAxs18E/k4uMLOXAxvHhywze8rMtpjZ3pv0tDazD81seeB7K58Z6yqkCj2whO8jwClAX+AnZtbXb6qgqQJucs71BUYCk8LovX3vBsJ3rfx/ANOcc32AQYTJ+zSzjsD1QJZzrj8QCVzgN1W9PcM+m/QAtwEfO+d6Ah8H7oeckCp0YDiQ55xb6ZyrAF4BJnjOFBTOuY3OuVmB28XsKYSOflMFj5l1Ak5jz45WYcXMWgJjgX8COOcqArt7hYsoIN7MooAEYIPnPPXinPucPfs27G0C8Gzg9rPAmY0aKkhCrdA7Auv2up9PGJXe98ysKzAE2N9m3KHqAeBWoMZ3kAbQDSgAng4MKU02s0TfoYLBObceuBdYC2wEipxzH/hN1SDaOuc2Bm5vAtr6DFNXoVboYc/MkoDXgRudczt95wkGMxsPbHHO5frO0kCigKHAY865IUApIfpP9n0FxpInsOeXVgcg0cwu9puqYbk9c7lDcj53qBX6eiBjr/udAo+FBTOLZk+Zv+ice8N3niAaDZxhZqvZM0x2rJm94DdSUOUD+c657/9Flc2egg8HxwOrnHMFzrlK4A3gKM+ZGsLm7/dGDnzf4jlPnYRaoX8H9DSzbmYWw54PZ972nCkozMzYMwa72Dl3n+88weScu90518k515U9f2afOOfC5izPObcJWGdmvQMPHQcs8hgpmNYCI80sIfB39DjC5APffbwNXBa4fRkw1WOWOqvNJtFNhnOuysyuBd5nz6ftTznnFnqOFSyjgUuA+WY2J/DYHc65dz1mktq7DngxcKKxEvip5zxB4ZybaWbZwCz2zMSaTYhfJm9mLwPHAGlmlg/8DrgbeM3MrmTPst7n+UtYd7r0X0QkTITakIuIiByACl1EJEyo0EVEwoQKXUQkTKjQRUTChApdRCRMqNBFRMLE/wc1wYwtu/phRAAAAABJRU5ErkJggg==\n"
          },
          "metadata": {
            "needs_background": "light"
          }
        }
      ],
      "source": [
        "temperatures = [4.4,5.1,6.1,6.2,6.1,6.1,5.7,5.2,4.7,4.1,3.9,3.5]\n",
        "s4 = pd.Series(temperatures, name=\"Temperature\")\n",
        "s4.plot()\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "You can easily convert it to Numpy array by dicarding the index. "
      ],
      "metadata": {
        "id": "jiCCsIfv4LoD"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "s4.to_numpy()"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "8FykEnJJ4HQl",
        "outputId": "65fbfab9-2483-4fad-a78b-e7fea6a22bf6"
      },
      "execution_count": 15,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([4.4, 5.1, 6.1, 6.2, 6.1, 6.1, 5.7, 5.2, 4.7, 4.1, 3.9, 3.5])"
            ]
          },
          "metadata": {},
          "execution_count": 15
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JLC98XwVbdkO"
      },
      "source": [
        "There are *many* options for plotting your data. It is not necessary to list them all here: if you need a particular type of plot (histograms, pie charts, etc.), just look for it in the excellent [Visualization](http://pandas.pydata.org/pandas-docs/stable/visualization.html) section of pandas' documentation, and look at the example code."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jwz5rnEsbdkO"
      },
      "source": [
        "### Handling time\n",
        "Many datasets have timestamps, and pandas is awesome at manipulating such data:\n",
        "* it can represent periods (such as 2016Q3) and frequencies (such as \"monthly\"),\n",
        "* it can convert periods to actual timestamps, and *vice versa*,\n",
        "* it can resample data and aggregate values any way you like,\n",
        "* it can handle timezones.\n",
        "\n",
        "#### Time range\n",
        "Let's start by creating a time series using `pd.date_range()`. This returns a `DatetimeIndex` containing one datetime per hour for 12 hours starting on April 23th 2022 at 5:30pm."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 16,
      "metadata": {
        "id": "becHbUssbdkO",
        "outputId": "3e5fc357-5efe-4e83-cf5e-a03c598724d0",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "DatetimeIndex(['2022-04-23 17:30:00', '2022-04-23 18:30:00',\n",
              "               '2022-04-23 19:30:00', '2022-04-23 20:30:00',\n",
              "               '2022-04-23 21:30:00', '2022-04-23 22:30:00',\n",
              "               '2022-04-23 23:30:00', '2022-04-24 00:30:00',\n",
              "               '2022-04-24 01:30:00', '2022-04-24 02:30:00',\n",
              "               '2022-04-24 03:30:00', '2022-04-24 04:30:00'],\n",
              "              dtype='datetime64[ns]', freq='H')"
            ]
          },
          "metadata": {},
          "execution_count": 16
        }
      ],
      "source": [
        "dates = pd.date_range('2022/04/23 5:30pm', periods=12, freq='H')\n",
        "dates"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Zhslg2XVbdkP"
      },
      "source": [
        "This `DatetimeIndex` may be used as an index in a `Series`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 17,
      "metadata": {
        "id": "ojYFJeizbdkP",
        "outputId": "c6e38d74-043b-457e-b83e-a727554dd093",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "2022-04-23 17:30:00    4.4\n",
              "2022-04-23 18:30:00    5.1\n",
              "2022-04-23 19:30:00    6.1\n",
              "2022-04-23 20:30:00    6.2\n",
              "2022-04-23 21:30:00    6.1\n",
              "2022-04-23 22:30:00    6.1\n",
              "2022-04-23 23:30:00    5.7\n",
              "2022-04-24 00:30:00    5.2\n",
              "2022-04-24 01:30:00    4.7\n",
              "2022-04-24 02:30:00    4.1\n",
              "2022-04-24 03:30:00    3.9\n",
              "2022-04-24 04:30:00    3.5\n",
              "Freq: H, dtype: float64"
            ]
          },
          "metadata": {},
          "execution_count": 17
        }
      ],
      "source": [
        "temp_series = pd.Series(temperatures, dates)\n",
        "temp_series"
      ]
    },
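    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A `DatetimeIndex` also allows selecting values by time. The sketch below rebuilds the series under the hypothetical name `ts` so that it runs on its own; it passes `pd.Timedelta(hours=1)` as the frequency instead of the `'H'` alias, which is an assumption made to stay independent of the pandas version, and the names `at_1930` and `next_day` are placeholders:\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "temperatures = [4.4, 5.1, 6.1, 6.2, 6.1, 6.1, 5.7, 5.2, 4.7, 4.1, 3.9, 3.5]\n",
        "dates = pd.date_range(\"2022-04-23 17:30\", periods=12, freq=pd.Timedelta(hours=1))\n",
        "ts = pd.Series(temperatures, index=dates)\n",
        "\n",
        "# Exact lookup with a Timestamp label.\n",
        "at_1930 = ts.loc[pd.Timestamp(\"2022-04-23 19:30\")]\n",
        "\n",
        "# Partial-string selection: every reading taken on April 24.\n",
        "next_day = ts.loc[\"2022-04-24\"]\n",
        "print(at_1930)\n",
        "print(next_day)\n",
        "```\n",
        "\n",
        "Passing only a date to `loc` selects all timestamps that fall on that day, which is often more convenient than building an explicit slice."
      ]
    },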
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "O_vVwZqWbdkP"
      },
      "source": [
        "Let's plot this series:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 18,
      "metadata": {
        "id": "gOa_KJ_JbdkP",
        "outputId": "f0da59fd-db16-43d8-8896-84fd8d7c1b2c",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 360
        }
      },
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ],
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWoAAAFXCAYAAACLPASQAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAcG0lEQVR4nO3de5CldX3n8feXixEYHEVIYwZxCN7Wdbww7SVFspnBqCheUrWuxl01rKbGrKuSWncFq/ZiKjdjFUZ3K5oQ76trx3hlxesqA2tWwB5ER0UEEZQpBS9cHCUi8t0/nqelaU9Pn26e53d+5znvV9Wp7vM8p8/n+f1O97fP+T3P83siM5Ek1eugSW+AJOnALNSSVDkLtSRVzkItSZWzUEtS5Q7p40mPPvro3Lp167p/7sc//jFHHHFE9xs04SzzzDNvdvI2mrVnz57vZ+YxI1dmZue37du350acf/75G/q52rPMM8+82cnbaBawmKvUVIc+JKlyFmpJqpyFWpIqZ6GWpMpZqCWpchZqSaqchVqSKmehlqTKWaglqXK9nEKuMraedd6q616x7XZOX2X9Na85ra9NktQD31FLUuUs1JJUubGGPiLi3sCbgYcDCbwwMz/X54ZNo6EPRZRu39D7UxrXuGPUbwA+npnPioh7AIf3uE2SpGXWLNQRsRn4F8DpAJl5G3Bbv5slSVoSzTSoB3hAxKOAc4CvAo8E9gBnZOaPVzxuF7ALYG5ubvvCwsK6N2b//v1s2rRp3T+3EX1k7d1386rr5g6D628dvW7bls3mVZB3ICV/N82b7ryNZu3cuXNPZs6PWjdOoZ4HLgJOzsyLI+INwC2Z+V9W+5n5+flcXFxc94bu3r2bHTt2rPvnNqKPrLXGVM/eO/oDTF9juOZ1p+TvpnnTnbfRrIhYtVCPc9THdcB1mXlxe/99wEnr3gpJ0oasWagz87vAtyPiIe2iJ9AMg0iSChj3qI+XAe9uj/i4Gvi3/W2SJGm5sQp1Zl4GjBw7kST1yzMTJalyFmpJqpyFWpIqZ6GWpMo5H7XUchIo1cp31JJUOQu1JFXOQi1JlbNQS1LlLNSSVDkLtSRVzkItSZWzUEtS5SzUklQ5C7UkVc5CLUmVs1BLUuUs1JJUOQu1JFXOQi1JlbNQS1LlvHCANCFeqEDj8h21JFXOQi1JlRv00IcfLSUNwViFOiKuAX4E/By4PTPn+9woSdKd1vOOemdmfr+3LZEkjeQYtSRVLjJz7QdFfBO4EUjgbzPznBGP2QXsApibm9u+sLCw7o3Zv38/mzZtWvfPrWbvvptXXTd3GFx/6+h127ZsNs+8weUdSNd/e7Oct9GsnTt37lltWHncQr0lM/dFxK8CnwJelpkXrvb4+fn5XFxcXPeG7t69mx07dqz751az1s7Es/eOHvnZ6M5E88yrOe9Auv7bm+W8jWZFxKqFeqyhj8zc1369Afgg8Nh1b4UkaUPWLNQRcUREHLn0PfAk4Mt9b5gkqTHOUR9zwAcjYunx/yszP97rVkmSfmHNQp2ZVwOPLLAtkqQRPDxPkipnoZakylmoJalyFmpJqtygZ8+TdCdnk5xevqOWpMpZqCWpchZqSaqchVqSKmehlqTKWaglqXIWakmqnIVakipnoZakyhU/M9GzoyRpfXxHLUmVs1BLUuUs1JJUOQu1JFXOQi1JlbNQS1LlLNSSVDkLtSRVzkItSZUb+8zEiDgYWAT2ZebT+tskSUPgWcjdWc876jOAy/vaEEnSaGMV6og4DjgNeHO/myNJWikyc+0HRbwP+AvgSOA/jhr6iIhdwC6Aubm57QsLCyOfa+++m1fNmTsMrr919LptWzavuZ2TzDLPPPMmm3cg+/fvZ9OmTZ0/b5dZO3fu3JOZ86PWrVmoI+JpwFMz8yURsYNVCvVy8/Pzubi4OHLdWuNWZ+8dPWy+kXGrklnmmWfeZP
MOZPfu3ezYsaPz5+0yKyJWLdTjDH2cDDwjIq4BFoBTIuJd694KSdKGrHnUR2a+CngVwLJ31M/rebskaV02cpTJtBxh4nHUklS5dV3hJTN3A7t72RJJ0ki+o5akylmoJalyFmpJqpyFWpIqZ6GWpMpZqCWpchZqSaqchVqSKmehlqTKWaglqXIWakmq3Lrm+pAklb8epO+oJalyFmpJqpyFWpIqZ6GWpMpZqCWpchZqSaqchVqSKmehlqTKWaglqXIWakmqnIVakipnoZakyq1ZqCPinhFxSUR8MSK+EhF/XGLDJEmNcWbP+ylwSmbuj4hDgc9GxMcy86Ket02SxBiFOjMT2N/ePbS9ZZ8bJUm6UzR1eI0HRRwM7AEeCPx1Zp454jG7gF0Ac3Nz2xcWFkY+1959N6+aM3cYXH/r6HXbtmxeczsnmWWeeeZNX15Nbdu5c+eezJwftW6sQv2LB0fcG/gg8LLM/PJqj5ufn8/FxcWR69aacPvsvaPf5G9kwu2SWeaZZ9705dXUtohYtVCv66iPzLwJOB84dT0/J0nauHGO+jimfSdNRBwGPBH4Wt8bJklqjHPUx/2Ad7Tj1AcB783Mj/S7WZKkJeMc9fEl4NEFtkWSNIJnJkpS5SzUklQ5C7UkVc5CLUmVs1BLUuUs1JJUOQu1JFXOQi1JlbNQS1LlLNSSVDkLtSRVzkItSZWzUEtS5SzUklQ5C7UkVc5CLUmVs1BLUuUs1JJUOQu1JFXOQi1JlbNQS1LlLNSSVDkLtSRVzkItSZWzUEtS5dYs1BFx/4g4PyK+GhFfiYgzSmyYJKlxyBiPuR14RWZeGhFHAnsi4lOZ+dWet02SxBjvqDPzO5l5afv9j4DLgS19b5gkqRGZOf6DI7YCFwIPz8xbVqzbBewCmJub276wsDDyOfbuu3nV5587DK6/dfS6bVs2j72dk8gyzzzzpi+vprbt3LlzT2bOj1o3dqGOiE3ABcCfZeYHDvTY+fn5XFxcHLlu61nnrfpzr9h2O2fvHT0ac81rThtrOyeVZZ555k1fXk1ti4hVC/VYR31ExKHA+4F3r1WkJUndGueojwDeAlyema/rf5MkScuN8476ZOD5wCkRcVl7e2rP2yVJaq15eF5mfhaIAtsiSRrBMxMlqXIWakmqnIVakipnoZakylmoJalyFmpJqpyFWpIqZ6GWpMpZqCWpchZqSaqchVqSKmehlqTKWaglqXIWakmqnIVakipnoZakylmoJalyFmpJqpyFWpIqZ6GWpMpZqCWpchZqSaqchVqSKmehlqTKrVmoI+KtEXFDRHy5xAZJku5qnHfUbwdO7Xk7JEmrWLNQZ+aFwA8LbIskaYTIzLUfFLEV+EhmPvwAj9kF7AKYm5vbvrCwMPJxe/fdvGrO3GFw/a2j123bsnnN7ZxklnnmmTd9eTW1befOnXsyc37Uus4K9XLz8/O5uLg4ct3Ws85b9edese12zt57yMh117zmtHGiJ5ZlnnnmTV9eTW2LiFULtUd9SFLlLNSSVLlxDs97D/A54CERcV1EvKj/zZIkLRk9kLJMZj63xIZIkkZz6EOSKmehlqTKWaglqXIWakmqnIVakipnoZakylmoJalyFmpJqpyFWpIqZ6GWpMpZqCWpchZqSaqchVqSKmehlqTKWaglqXIWakmqnIVakipnoZakylmoJalyFmpJqpyFWpIqZ6GWpMpZqCWpchZqSaqchVqSKjdWoY6IUyPiioi4KiLO6nujJEl3WrNQR8TBwF8DTwEeBjw3Ih7W94ZJkhrjvKN+LHBVZl6dmbcBC8Az+90sSdKSyMwDPyDiWcCpmfkH7f3nA4/LzJeueNwuYFd79yHAFRvYnqOB72/g5zaiZJZ55pk3O3kbzXpAZh4zasUhd2977pSZ5wDn3J3niIjFzJzvaJOqyTLPPPNmJ6+PrHGGPvYB9192/7h2mSSpgHEK9eeBB0XECRFxD+D3gHP73SxJ0pI1hz4y8/aIeC
nwCeBg4K2Z+ZWetuduDZ1UnGWeeebNTl7nWWvuTJQkTZZnJkpS5SzUklQ5C7UkVa6z46jXKyI2A6cCW9pF+4BPZOZNPWQFzRmWy7MuyZ4G6Eu2rc0bevsG3Z9t5tzyvMy8vses0r8vQ+/P3rMmsjMxIl4A/Dfgk9x5TPZxwBOBP87Md3aY9STgjcCVK7IeCLwkMz/ZVVabV6xtbd7Q2zf0/nwU8DfA5hV5N7V5l3acV7p9g+3Poq9dZha/0Zxefu8Ry+8DfL3jrMuBrSOWnwBcPs1tm5H2Db0/L6OZkmHl8scDXxxA+wbbnyWzJjVGHcCot/J3tOu6dAhw3Yjl+4BDO86Csm2D4bdv6P15RGZevHJhZl4EHNFDXun2Dbk/i2VNaoz6z4BLI+KTwLfbZcfTfJz9k46z3gp8PiIWlmXdn+YMy7d0nAVl2wbDb9/Q+/NjEXEe8M4VeS8APt5DXun2Dbk/i2VN7ISXiLgP8GR+eQfRjT1kPQx4xoqsczPzq11ntXnF2tbmDb19Q+/Pp9BMHbwy76M95ZVu32D7s1TWRM9MLLlnts07CiAzf9hnTptVtG1t5mDbN/T+nITS7Rt6f/ZpUkd9LN9beh3NWGNfe2aPB14LnALc3GbdC/gMcFZmXtNVVptXrG1t3tDbN/T+3Ay8iuZd2RzNePwNwIeB12THhyBOoH2D7c+ir13Xe11r21sKfA54DnDwsmUH04yRXTTNbZuR9g29Pz8BnAkcu2zZscBZwCcH0L7B9mfRrK47aswGXnmAdVcVzFp13TS0zfYNoj+v2Mi6KWrfYPuzZNakjvoouWd2T0S8EXjHiqzfB77QcRaU34s/9PYNvT+vjYhXAu/Idty9HY8/fVl+l0q3b8j9WSxrkkd9lNlb2lzs4EUrsq4D/jfwlsz8aZd5bWbJvc6Dbl/pvNL92R7RchZ3Hee8nubiHH+ZHe94m0D7BtufRbMmVaglSeOZ+Ox57dXLV73fcdbTDnS/h7xibWuff+jtG3p/nnSg+z3klW7fYPuz76yJF2p++TTgPk4LXvKYNe53rWTbYPjtG3p//rs17netdPuG3J+9Zjn0IUmVm+R81E8Gfpe77iD6cGZ2vic/Ih7K6J1Rl3ed1eYVa1ubN/T2Db0/S8+3Xbp9g+3PUlkTGfqIiNcDZwAX0Jy19Nr2+5dHxBs6zjoTWKD5mHxJewvgPRFxVpdZbV6xtrV5Q2/f0PvzBcClwA7g8Pa2k+awthf0kFe6fYPtz6KvXdcHnI95oPjIeYRpXsBOD4IHvg4cOmL5PbrOKt22WWnfwPuz9HzbxX9fhtqfJbMmtTPxnyJi1I6ExwD/1HHWHcCvjVh+v3Zd10q2DYbfvqH3Z+n5tku3b8j9WSxrUmPUpwNviogjuXNS8fvTTNpyesdZfwR8OiKu5K7zGT8QeGnHWVC2bTD89pXOK92fpefbLt2+IfdnsaxJT3N6LHeduvK7PeUcxC9fXPPzmfnzPvLazCJta7MG3b7SeaX7M8rPt126fYPtz1JZkzyF/FiAzPxuRBwD/BbwtexhMvH2F4XMvKM9pfXhwDXZ07y4JdvW5g29fYPuzxH5R/WZNYHfl0H3Z5GsrgfzxxyEfzHwTeAamgPDL6a5LM8VwIs6zvpdmvPvv0NziNDFwKdpPkY/fZrbNiPtG3p/nkxzAdivAI8DPgV8g+aj9G8MoH2D7c+iWV131JgN3EtzKMt9gf2087nS7C29rOOsL9DMEXsCcAvwkHb5A4DFaW7bjLRv6P15CbAN+A3g+8BvtstPAv5xAO0bbH+WzJrUzsSfZeZPgJ9ExDeyHW/MzBsjovOxmKXnj4hvZeYV7bJrlz6Sdaxo29rnHnL7ht6fh2bm3jbve5n52Tbv0og4rIe80u0bcn8Wy5rU4XkZEUuXij9taWFE3JMetmnZL8QLly07mOZYzq4VbVv73ENu39D7c3kbXrViXR95pds35P4sl9X1R48xPzIcDxwyYvkW4Hc6zn
oMcM8Ry7cCz5vmts1I+4ben88ADh+x/ETglQNo32D7s2SWkzJJUuUmPs1pRJxzoPsdZ736QPd7yCvWtvb5X32g+z3klW7f0Puz9Hzbrz7Q/QHklZzrvtesiRdq4G/XuN+lPWvc71rJtsHw2zf0/iw933bp9g25P3vNcuhDkio3qWlOPxARz4uITQWyfj0i3hoRfxoRmyLi7yLiyxHxDxGxtYe8gyLihRFxXkR8MSIujYiFiNjRdVabd0hEvDgiPh4RX2pvH4uIP1x2tEQRfQxFRMTBbfv+JCJOXrHuP/eQd3hEvDIi/lNE3DMiTo+IcyPitX39vkbEkyPiTW3Oue33p/aRtcZ2/NeenvfJEfGilX9vEfHC0T9xt7IiIp4dEf+q/f4JEfHfI+IlfR1+uCL/M7087yTeUUfEPuBzwCnA/wHeA5yXmbf1kHVh+/ybgecBbwPeCzwJ+DeZeUrHeW8DrqVp17NoDvL/v8CZNJPd/4+O894D3AS8gzsnLToO+H3gqMx8Tsd5R622CvhiZh7Xcd6baU54uQR4PnBBZv6Hdt2lmdnpteki4r00Z5YdBjyE5syzv6fZw39sZj6/47zXAw8G3sldX78X0EwDekaXeWtsy7cy8/iOn/PPgd+kmbf56cDrl/4Genr93gj8Ks3hcbcAv0JzVfDTgOu77M+I+NLKRTSv5dKx4o/oKqvTQ2PWcVjLF9qv96L54/so8D2aIvqkPrLa77+12roO87604v5F7ddfAS7vIW/VeW8PtO5u5P0cuJrmtO6l29L92/rsT5rZHs8BPtD2Zx+v32Xt1wC+y51vZmLla9vn60d/823fssrtR8DtPeTtpT28Erh3+7f+V+39Pl6/ve3XQ4EfAPdY9rvT6etH8w/gXcBDac603ErzT/4BwAO6zJrYCS8AmXlLZv7PzHxq29iLga6v+nBHRDw4mjmND4+IeYCIeCBwcMdZAD+LiBPbjJOA2wAy86eMnrv27vph+zHvF69lO/zyHKCP2deuBnZk5gnLbr+emSfQzOnQtV+cOJCZt2fmLuAy4DNAb0Nn2fwlfrT9unS/j9ev9HzbNwEPysx7rbgdSTMfR9cOyczbAbK5PNXTgXtFxD/QzwkvS1k/o5mhb+nv73Y6nv86M58BvJ/mzcMjM/MamjNpr83Ma7vM6vS/2Tr+E11YMOsJNB9FLqf5CPZ+4CrgBuCZPeSdAnwLuJLmXebj2uXHAK/tIW8rzUfz79FcTePrbdv+Hjihh7x/3/5Sjlr3sh7y3gWcOmL5H7R/FF3nvRnYNGL5icBne8g7ieYNyleBT7a3y4GLgO095P0p8NhV1v1lD3kfAX57le24o4e8j63y+h0LXNJ1XvvcRwCvAz4MXNdHxkwe9RERRwM3Zn/z4QZw38z8fh/Pf4Dc+wJk5g9K5s6KiIjs6Q8mCs/vXUq0c15k5q0j1m3JzH2FtuMI4IjMvKHHjEfSzJr3N50/d22FOiKemJmfGlpWn3kRcS/gmMz8xorlj8jMlTs8zKsvr/RVyM2bsqwaTnhZ6S0DzeolLyKeDXwNeH9EfGXFeOfbzas+r/RVyM2bxqxJvKOOiHNXWwWckplHTGPWhPIuA56Smd+JiMfSHOb1qsz8YER8ITMfbV7VeVfQ7Me4acXy+wAXZ+aDzaszr2TWpOaj/i2aY5r3r1geNNdWm9asSeQdnJnfAcjMSyJiJ/CRiLg//RylYF63Sl+F3LwpzJpUob4I+ElmXrByRftfalqzJpH3o4g4cWk8tX0nuAP4EPDPzas+r/RVyM2bwqzqdiZqfdo9zT/OzKtWLD8UeHZmvtu8evPa5y59FXLzpizLQi1JlavxqA9J0jIWakmqnIVaqkw7R4x5U5jXV9ak5qN+aDRzJp8XESdGxNsj4qaIuCQi/tm0Zpln3gbyTlpx2w6cGxGP7uOP3rzpzOp88pBxbsCFNLNoPZdm7ubfoznu8OnAp6c1yzzzNpB3B/D/gPOX3W5tv37GvH
rzimZ13VFjNnD5HNFXrVh36bRmmWfeBvL+JXABzdmQS8u+2XWOedOdNakx6uXzQL9uxbqu56gtmWWeeeuSme+nufrIk6K5PNzx9HMGpHlTnNXLf7Ux/hO9mNFzxj6Q5lI9U5llnnl3M/vRNB+bb+gzx7zpy/KEF6kiERHAkZl5i3nTlddn1sQOz4uyVyYulmWeeXcnLxu3mDcdecWySnwEGfEx4S9o9q6/HvgGyy7hRPc7E4tlmWfeBvL+3LzpzCua1XVHjdnAYlcmLpllnnnmzU5eyaxJDX2UvDJx6asgm2eeebORVyxrUoX6GxHx20t3MvPnmfkimquFd332V8ks88wzb3byimVN6lJcxa5MXDLLPPPMm528klkTeUedmbeubFxEvLpd1+kLVzLLPPPMm528klk1zZ73jIFmmWeeebOT10tWTYW6jwtd1pBlnnnmzU5eL1nVnJkYEQdl5h1DyzLPPPNmJ6+vrJreUX9toFnmmWfe7OT1kjWpoz5+xJ2zTC19VDgc+AmQmXmvacwyzzzzZievZNak3lG/DfgQ8KDMPDIzjwS+1X7f6QtXOMs888ybnbxyWdnxKZzj3oDtwGeAl9P8w7h6CFnmmWfe7OSVyprYGHVm7gF+p717AXDPIWSZZ555s5NXKquKoz4i4n7AozPzo0PKMs8882Ynr8+sQ7p+wnFFxEOBZwJb2kX7IuKbmXn5NGeZZ555s5NXKmsiQx8RcSawQLOn9JL2FsB7IuKsac0yzzzzZievaNv6HNQ/wAD814FDRyy/B3DltGaZZ555s5NXMmtSOxPvAH5txPL7teumNcs888ybnbxiWZMao/4j4NMRcSXw7XbZ8TRXen7pFGeZZ555s5NXLGtiR31ExEHAY1k2CA98PjN/Ps1Z5pln3uzkFcvqeozoboz37BpilnnmmTc7eX1l1TQp0x8ONMs888ybnbxesmoq1FM/Z6x55pk383mDn4/6uMy8bmhZ5pln3uzk9ZU1sXfUEfHQiHhCRGwCWGpcRJw6zVnmmWfe7OQVyyo5qL9swP3lNJdU/xBwDfDMZesundYs88wzb3byimZ13VFjNnAvsKn9fiuwCJzR3v/CtGaZZ555s5NXMmtSJ7wclJn7ATLzmojYAbwvIh5A94PxJbPMM8+82ckrljWpMerrI+JRS3faxj4NOBrYNsVZ5pln3uzkFcua1DUTjwNuz8zvjlh3cmb+4zRmmWeeebOTVzRrEoVakjS+Sc1HvS0iLoqIb0fEORFxn2XrLpnWLPPMM2928kpmTWqM+k3Aq2nGcb4OfDYiTmzXHTrFWeaZZ97s5JXL6vrwmDEPa/niivs7gSuBx9P9sY7Fsswzz7zZySua1XVHjdtAYPOKZY9oG/mDac0yzzzzZievaFbXHTVmA/818PgRy48H/m5as8wzz7zZySuZ5VEfklS5SR31sTkiXhMRX4uIH0bEDyLi8nbZvac1yzzzzJudvJJZkzrq473AjcCOzDwqM+9LMxB/Y7tuWrPMM8+82ckrl9X1GNGYYztXbGRd7VnmmWfe7OSVzJrUO+prI+KVETG3tCAi5iLiTO68mu80Zplnnnmzk1csa1KF+jnAfYEL2rGdHwK7gaOAZ09xlnnmmTc7ecWyPOpDkipXw6W4jlixvM/L5fSeZZ555s1OXrGsrgfzxxyEH+Tlcswzz7zZySua1XVHjdnAQV4uxzzzzJudvJJZXoqre+aZZ95s5HkprinNMs8882Ynz0txTWOWeeaZNzt5RbMmUaglSeOb2OF5kqTxWKglqXIWakmqnIVakir3/wHhmwVn4PFWWAAAAABJRU5ErkJggg==\n"
          },
          "metadata": {
            "needs_background": "light"
          }
        }
      ],
      "source": [
        "temp_series.plot(kind=\"bar\")\n",
        "\n",
        "plt.grid(True)\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mh0jD10ZbdkU"
      },
      "source": [
        "### Periods\n",
        "The `pd.period_range()` function returns a `PeriodIndex` instead of a `DatetimeIndex`. For example, let's get all quarters in 2016 and 2017:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 19,
      "metadata": {
        "id": "MuvcVUi7bdkU",
        "outputId": "9bed4376-f3ce-4d25-d04c-1f7a9c2c1878",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "PeriodIndex(['2021Q1', '2021Q2', '2021Q3', '2021Q4', '2022Q1', '2022Q2',\n",
              "             '2022Q3', '2022Q4'],\n",
              "            dtype='period[Q-DEC]')"
            ]
          },
          "metadata": {},
          "execution_count": 19
        }
      ],
      "source": [
        "quarters = pd.period_range('2021Q1', periods=8, freq='Q')\n",
        "quarters"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "s0U-6hwvbdkU"
      },
      "source": [
        "Adding a number `N` to a `PeriodIndex` shifts the periods by `N` times the `PeriodIndex`'s frequency:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 20,
      "metadata": {
        "id": "msf61zGDbdkV",
        "outputId": "110697a5-be13-45f3-82ea-ad078c52e07c",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "PeriodIndex(['2021Q4', '2022Q1', '2022Q2', '2022Q3', '2022Q4', '2023Q1',\n",
              "             '2023Q2', '2023Q3'],\n",
              "            dtype='period[Q-DEC]')"
            ]
          },
          "metadata": {},
          "execution_count": 20
        }
      ],
      "source": [
        "quarters + 3"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "svb-1SlLbdkW"
      },
      "source": [
        "Pandas also provides many other time-related functions that we recommend you check out in the [documentation](http://pandas.pydata.org/pandas-docs/stable/timeseries.html)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_3B8znu4bdkX"
      },
      "source": [
        "### `DataFrame` objects\n",
        "A DataFrame object represents a spreadsheet, with cell values, column names and row index labels. You can define expressions to compute columns based on other columns, create pivot-tables, group rows, draw graphs, etc. You can see `DataFrame`s as dictionaries of `Series`."
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "\n",
        "#### Creating a `DataFrame`\n",
        "You can create a DataFrame by passing a dictionary of `Series` objects:"
      ],
      "metadata": {
        "id": "ge2vV_g9nKT3"
      }
    },
    {
      "cell_type": "code",
      "execution_count": 69,
      "metadata": {
        "id": "YYuXxk5IbdkX",
        "outputId": "00fad6dd-7685-45ea-81db-71e635dd284f",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  birthyear  children    hobby\n",
              "alice        68       1985       NaN   Biking\n",
              "bob          83       1984       3.0  Dancing\n",
              "charles     112       1992       0.0      NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-866f3739-ff1b-4cb7-b05b-069871e8f603\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>birthyear</th>\n",
              "      <th>children</th>\n",
              "      <th>hobby</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>1985</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Biking</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>1984</td>\n",
              "      <td>3.0</td>\n",
              "      <td>Dancing</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>1992</td>\n",
              "      <td>0.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-866f3739-ff1b-4cb7-b05b-069871e8f603')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-866f3739-ff1b-4cb7-b05b-069871e8f603 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-866f3739-ff1b-4cb7-b05b-069871e8f603');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 69
        }
      ],
      "source": [
        "people_dict = {\n",
        "    \"weight\": pd.Series([68, 83, 112], index=[\"alice\", \"bob\", \"charles\"]),\n",
        "    \"birthyear\": pd.Series([1984, 1985, 1992], index=[\"bob\", \"alice\", \"charles\"], name=\"year\"),\n",
        "    \"children\": pd.Series([0, 3], index=[\"charles\", \"bob\"]),\n",
        "    \"hobby\": pd.Series([\"Biking\", \"Dancing\"], index=[\"alice\", \"bob\"]),\n",
        "}\n",
        "people = pd.DataFrame(people_dict)\n",
        "people"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "q5PqT_JjbdkX"
      },
      "source": [
        "A few things to note:\n",
        "* the `Series` were automatically aligned based on their index,\n",
        "* missing values are represented as `NaN`,\n",
        "* `Series` names are ignored (the name `\"year\"` was dropped),\n",
        "* `DataFrame`s are displayed nicely in Jupyter notebooks!"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Z7-ImAbLbdkX"
      },
      "source": [
        "You can access columns pretty much as you would expect. They are returned as `Series` objects:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 70,
      "metadata": {
        "id": "Wzc_2C8fbdkX",
        "outputId": "4cc1bd71-c1e5-4456-b996-912beaae64cd",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "alice      1985\n",
              "bob        1984\n",
              "charles    1992\n",
              "Name: birthyear, dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 70
        }
      ],
      "source": [
        "people[\"birthyear\"]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EorAvWrRbdkX"
      },
      "source": [
        "You can also get multiple columns at once:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 71,
      "metadata": {
        "id": "1MlqI9oLbdkX",
        "outputId": "011f6209-92cc-4680-fb59-714a0ee4e4cf",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         birthyear    hobby\n",
              "alice         1985   Biking\n",
              "bob           1984  Dancing\n",
              "charles       1992      NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-05ac5bf6-0f9b-4603-a133-feb858c6ee08\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>birthyear</th>\n",
              "      <th>hobby</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>1985</td>\n",
              "      <td>Biking</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>1984</td>\n",
              "      <td>Dancing</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>1992</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-05ac5bf6-0f9b-4603-a133-feb858c6ee08')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-05ac5bf6-0f9b-4603-a133-feb858c6ee08 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-05ac5bf6-0f9b-4603-a133-feb858c6ee08');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 71
        }
      ],
      "source": [
        "people[[\"birthyear\", \"hobby\"]]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BR_u72S8bdkY"
      },
      "source": [
        "Another convenient way to create a `DataFrame` is to pass all the values to the constructor as an `ndarray`, or a list of lists, and specify the column names and row index labels separately:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 72,
      "metadata": {
        "id": "7W8qDSb0bdkY",
        "outputId": "379196e3-c6aa-4ad8-8eee-f9a047905e9f",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         birthyear  children    hobby  weight\n",
              "alice         1985       NaN   Biking      68\n",
              "bob           1984       3.0  Dancing      83\n",
              "charles       1992       0.0      NaN     112"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-43b3b441-aa45-4797-a618-6ca6cae84a93\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>birthyear</th>\n",
              "      <th>children</th>\n",
              "      <th>hobby</th>\n",
              "      <th>weight</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>1985</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Biking</td>\n",
              "      <td>68</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>1984</td>\n",
              "      <td>3.0</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>83</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>1992</td>\n",
              "      <td>0.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>112</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-43b3b441-aa45-4797-a618-6ca6cae84a93')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-43b3b441-aa45-4797-a618-6ca6cae84a93 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-43b3b441-aa45-4797-a618-6ca6cae84a93');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 72
        }
      ],
      "source": [
        "values = [\n",
        "            [1985, np.nan, \"Biking\",   68],\n",
        "            [1984, 3,      \"Dancing\",  83],\n",
        "            [1992, 0,      np.nan,    112]\n",
        "         ]\n",
        "d3 = pd.DataFrame(\n",
        "        values,\n",
        "        columns=[\"birthyear\", \"children\", \"hobby\", \"weight\"],\n",
        "        index=[\"alice\", \"bob\", \"charles\"]\n",
        "     )\n",
        "d3"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kV6Iyx6KbdkY"
      },
      "source": [
        "To specify missing values, you can either use `np.nan` or NumPy's masked arrays:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 73,
      "metadata": {
        "id": "7eq0bxHWbdkY",
        "outputId": "baa21e0a-d0d9-4c4b-e789-ff145155aa2b",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 216
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stderr",
          "text": [
            "/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:1: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe. \n",
            "Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations\n",
            "  \"\"\"Entry point for launching an IPython kernel.\n"
          ]
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "        birthyear children    hobby weight\n",
              "alice        1985      NaN   Biking     68\n",
              "bob          1984        3  Dancing     83\n",
              "charles      1992        0      NaN    112"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-04b781e6-a4db-4948-9c11-97855b36766a\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>birthyear</th>\n",
              "      <th>children</th>\n",
              "      <th>hobby</th>\n",
              "      <th>weight</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>1985</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Biking</td>\n",
              "      <td>68</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>1984</td>\n",
              "      <td>3</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>83</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>1992</td>\n",
              "      <td>0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>112</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-04b781e6-a4db-4948-9c11-97855b36766a')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-04b781e6-a4db-4948-9c11-97855b36766a button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-04b781e6-a4db-4948-9c11-97855b36766a');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 73
        }
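      ],
      "source": [
        "# Use the builtin `object`; the `np.object` alias is deprecated since NumPy 1.20\n",
        "masked_array = np.ma.asarray(values, dtype=object)\n",
        "# Mask the cells at (row 0, column 1) and (row 2, column 2) so they show up as NaN\n",
        "masked_array[(0, 2), (1, 2)] = np.ma.masked\n",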
        "d3 = pd.DataFrame(\n",
        "        masked_array,\n",
        "        columns=[\"birthyear\", \"children\", \"hobby\", \"weight\"],\n",
        "        index=[\"alice\", \"bob\", \"charles\"]\n",
        "     )\n",
        "d3"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "You can also create a multi-index `DataFrame` as follows:"
      ],
      "metadata": {
        "id": "1hA9uEAiqUlg"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "df = pd.DataFrame(\n",
        "    {\"a\": [4, 5, 6],\n",
        "     \"b\": [7, 8, 9],\n",
        "     \"c\": [10, 11, 12]},\n",
        "    index=pd.MultiIndex.from_tuples(\n",
        "        [('d', 1), ('d', 2), ('e', 2)], names=['n', 'v'])\n",
        ")\n",
        "df"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        },
        "id": "xsBbZelxn8tC",
        "outputId": "951bab2e-0546-4d81-e33c-ec1cc995a215"
      },
      "execution_count": 74,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "     a  b   c\n",
              "n v          \n",
              "d 1  4  7  10\n",
              "  2  5  8  11\n",
              "e 2  6  9  12"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-cfd03cf6-991a-40b8-88d0-6acc1f5f2625\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "      <th>a</th>\n",
              "      <th>b</th>\n",
              "      <th>c</th>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>n</th>\n",
              "      <th>v</th>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th rowspan=\"2\" valign=\"top\">d</th>\n",
              "      <th>1</th>\n",
              "      <td>4</td>\n",
              "      <td>7</td>\n",
              "      <td>10</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>5</td>\n",
              "      <td>8</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>e</th>\n",
              "      <th>2</th>\n",
              "      <td>6</td>\n",
              "      <td>9</td>\n",
              "      <td>12</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-cfd03cf6-991a-40b8-88d0-6acc1f5f2625')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-cfd03cf6-991a-40b8-88d0-6acc1f5f2625 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-cfd03cf6-991a-40b8-88d0-6acc1f5f2625');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 74
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "If all column labels are tuples of the same size, pandas interprets them as a multi-index. The same goes for row index labels. For example:"
      ],
      "metadata": {
        "id": "ZJG34ozirdgh"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "d5 = pd.DataFrame(\n",
        "  {\n",
        "    (\"public\", \"birthyear\"):\n",
        "        {(\"Paris\",\"alice\"):1985, (\"Paris\",\"bob\"): 1984, (\"London\",\"charles\"): 1992},\n",
        "    (\"public\", \"hobby\"):\n",
        "        {(\"Paris\",\"alice\"):\"Biking\", (\"Paris\",\"bob\"): \"Dancing\"},\n",
        "    (\"private\", \"weight\"):\n",
        "        {(\"Paris\",\"alice\"):68, (\"Paris\",\"bob\"): 83, (\"London\",\"charles\"): 112},\n",
        "    (\"private\", \"children\"):\n",
        "        {(\"Paris\", \"alice\"):np.nan, (\"Paris\",\"bob\"): 3, (\"London\",\"charles\"): 0}\n",
        "  }\n",
        ")\n",
        "d5"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        },
        "id": "sge2O2qvrDop",
        "outputId": "a1d196e1-05d4-4676-f82b-8a172fbaf331"
      },
      "execution_count": 75,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "                  public          private         \n",
              "               birthyear    hobby  weight children\n",
              "Paris  alice        1985   Biking      68      NaN\n",
              "       bob          1984  Dancing      83      3.0\n",
              "London charles      1992      NaN     112      0.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-640277e0-5d73-4e45-ac98-421de6fdb933\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead tr th {\n",
              "        text-align: left;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "      <th colspan=\"2\" halign=\"left\">public</th>\n",
              "      <th colspan=\"2\" halign=\"left\">private</th>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "      <th>birthyear</th>\n",
              "      <th>hobby</th>\n",
              "      <th>weight</th>\n",
              "      <th>children</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th rowspan=\"2\" valign=\"top\">Paris</th>\n",
              "      <th>alice</th>\n",
              "      <td>1985</td>\n",
              "      <td>Biking</td>\n",
              "      <td>68</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>1984</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>83</td>\n",
              "      <td>3.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>London</th>\n",
              "      <th>charles</th>\n",
              "      <td>1992</td>\n",
              "      <td>NaN</td>\n",
              "      <td>112</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-640277e0-5d73-4e45-ac98-421de6fdb933')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-640277e0-5d73-4e45-ac98-421de6fdb933 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-640277e0-5d73-4e45-ac98-421de6fdb933');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 75
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "You can now get a DataFrame containing all the \"public\" columns very simply:"
      ],
      "metadata": {
        "id": "EeN5brxLrmHy"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "d5[\"public\"]"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        },
        "id": "m4yGqoUOrlnZ",
        "outputId": "98c6824f-bae8-40a0-f790-0188ad4144cb"
      },
      "execution_count": 76,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "                birthyear    hobby\n",
              "Paris  alice         1985   Biking\n",
              "       bob           1984  Dancing\n",
              "London charles       1992      NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-f84540d2-337f-46ee-a42a-4be6d973f614\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "      <th>birthyear</th>\n",
              "      <th>hobby</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th rowspan=\"2\" valign=\"top\">Paris</th>\n",
              "      <th>alice</th>\n",
              "      <td>1985</td>\n",
              "      <td>Biking</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>1984</td>\n",
              "      <td>Dancing</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>London</th>\n",
              "      <th>charles</th>\n",
              "      <td>1992</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-f84540d2-337f-46ee-a42a-4be6d973f614')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-f84540d2-337f-46ee-a42a-4be6d973f614 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-f84540d2-337f-46ee-a42a-4be6d973f614');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 76
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IT41nGg7bdkc"
      },
      "source": [
        "Note that most pandas methods return a modified copy and leave the original object unchanged."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HHUt1_d1bdkc"
      },
      "source": [
        "#### Subsets - Accessing rows\n",
        "Let's go back to the `people` `DataFrame`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 77,
      "metadata": {
        "id": "75UxPZPAbdkc",
        "outputId": "a04a8e89-6897-437b-fa44-8339b8312c3d",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  birthyear  children    hobby\n",
              "alice        68       1985       NaN   Biking\n",
              "bob          83       1984       3.0  Dancing\n",
              "charles     112       1992       0.0      NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-d7ea7f11-6371-414f-b158-7ca3d0f03aa5\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>birthyear</th>\n",
              "      <th>children</th>\n",
              "      <th>hobby</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>1985</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Biking</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>1984</td>\n",
              "      <td>3.0</td>\n",
              "      <td>Dancing</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>1992</td>\n",
              "      <td>0.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-d7ea7f11-6371-414f-b158-7ca3d0f03aa5')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-d7ea7f11-6371-414f-b158-7ca3d0f03aa5 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-d7ea7f11-6371-414f-b158-7ca3d0f03aa5');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 77
        }
      ],
      "source": [
        "people"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "86z3de-Ibdkc"
      },
      "source": [
        "**The `loc` attribute lets you access rows instead of columns.** The result is a `Series` object in which the `DataFrame`'s column names are mapped to row index labels:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 78,
      "metadata": {
        "id": "2ii7IFnIbdkc",
        "outputId": "ff1e27cb-8f0a-4287-aefc-043155dbe148",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "weight        112\n",
              "birthyear    1992\n",
              "children      0.0\n",
              "hobby         NaN\n",
              "Name: charles, dtype: object"
            ]
          },
          "metadata": {},
          "execution_count": 78
        }
      ],
      "source": [
        "people.loc[\"charles\"]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "M9n8cJsDbdkd"
      },
      "source": [
        "You can also access rows by integer location using the `iloc` attribute:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 79,
      "metadata": {
        "id": "u2T-r9f2bdkd",
        "outputId": "97197019-38cd-47ae-da80-ee5aedbd6fca",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "weight        112\n",
              "birthyear    1992\n",
              "children      0.0\n",
              "hobby         NaN\n",
              "Name: charles, dtype: object"
            ]
          },
          "metadata": {},
          "execution_count": 79
        }
      ],
      "source": [
        "people.iloc[2]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "h9wxoGC2bdkd"
      },
      "source": [
        "You can also get a slice of rows, and this returns a `DataFrame` object:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 80,
      "metadata": {
        "id": "PUAGAfWmbdkd",
        "outputId": "aa7df8c9-05f4-4c0f-ce2b-d5a079004993",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 112
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  birthyear  children    hobby\n",
              "bob          83       1984       3.0  Dancing\n",
              "charles     112       1992       0.0      NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-e2bedf2e-d6af-49af-86b0-90698b401379\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>birthyear</th>\n",
              "      <th>children</th>\n",
              "      <th>hobby</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>1984</td>\n",
              "      <td>3.0</td>\n",
              "      <td>Dancing</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>1992</td>\n",
              "      <td>0.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e2bedf2e-d6af-49af-86b0-90698b401379')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-e2bedf2e-d6af-49af-86b0-90698b401379 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-e2bedf2e-d6af-49af-86b0-90698b401379');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 80
        }
      ],
      "source": [
        "people.iloc[1:3]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "agVv1ZEKbdkd"
      },
      "source": [
        "Finally, you can pass a boolean array to get the matching rows. This is most useful when combined with boolean expressions:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 81,
      "metadata": {
        "id": "Uh00Rgg1bdkd",
        "outputId": "0b5abc4e-0f18-408f-bdc5-3bbbdbc7534c",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 112
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "       weight  birthyear  children    hobby\n",
              "alice      68       1985       NaN   Biking\n",
              "bob        83       1984       3.0  Dancing"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-11d05141-79c0-4987-b97d-46d02e5abea6\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>birthyear</th>\n",
              "      <th>children</th>\n",
              "      <th>hobby</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>1985</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Biking</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>1984</td>\n",
              "      <td>3.0</td>\n",
              "      <td>Dancing</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-11d05141-79c0-4987-b97d-46d02e5abea6')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-11d05141-79c0-4987-b97d-46d02e5abea6 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-11d05141-79c0-4987-b97d-46d02e5abea6');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 81
        }
      ],
      "source": [
        "people[people[\"birthyear\"] < 1990]"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "You can also accessing columns by specifiying the second axis:"
      ],
      "metadata": {
        "id": "KVvkXiAksSd6"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "people.iloc[:,2]"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "i4uSRaCfsaGp",
        "outputId": "fd5f66c2-6341-4b3d-993d-3ae570415c39"
      },
      "execution_count": 82,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "alice      NaN\n",
              "bob        3.0\n",
              "charles    0.0\n",
              "Name: children, dtype: float64"
            ]
          },
          "metadata": {},
          "execution_count": 82
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KuXwobq6bdke"
      },
      "source": [
        "#### Adding and removing columns\n",
        "You can generally treat `DataFrame` objects like dictionaries of `Series`, so the following work fine:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 83,
      "metadata": {
        "id": "9n9f6-N_bdke",
        "outputId": "000fb941-5b79-4b15-dfd4-303559a1b9d0",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  birthyear  children    hobby\n",
              "alice        68       1985       NaN   Biking\n",
              "bob          83       1984       3.0  Dancing\n",
              "charles     112       1992       0.0      NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-068aa998-76c9-45c0-9534-ab135acfa8fd\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>birthyear</th>\n",
              "      <th>children</th>\n",
              "      <th>hobby</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>1985</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Biking</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>1984</td>\n",
              "      <td>3.0</td>\n",
              "      <td>Dancing</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>1992</td>\n",
              "      <td>0.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-068aa998-76c9-45c0-9534-ab135acfa8fd')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-068aa998-76c9-45c0-9534-ab135acfa8fd button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-068aa998-76c9-45c0-9534-ab135acfa8fd');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 83
        }
      ],
      "source": [
        "people"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "people[\"age\"] = 2018 - people[\"birthyear\"]  # adds a new column \"age\"\n",
        "people[\"over 30\"] = people[\"age\"] > 30      # adds another column \"over 30\"\n",
        "birthyears = people.pop(\"birthyear\")\n",
        "people.drop(columns=['children'], inplace=True) # drop a column inplace\n",
        "people"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        },
        "id": "Y-qN7BgDtlh9",
        "outputId": "9bf41a39-00b6-49d1-fc11-cb1ccbb788eb"
      },
      "execution_count": 90,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight    hobby  age  over 30\n",
              "alice        68   Biking   33     True\n",
              "bob          83  Dancing   34     True\n",
              "charles     112      NaN   26    False"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-f39505dd-26fa-4f82-9661-428bfec32b10\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>hobby</th>\n",
              "      <th>age</th>\n",
              "      <th>over 30</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>Biking</td>\n",
              "      <td>33</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>34</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>NaN</td>\n",
              "      <td>26</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-f39505dd-26fa-4f82-9661-428bfec32b10')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-f39505dd-26fa-4f82-9661-428bfec32b10 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-f39505dd-26fa-4f82-9661-428bfec32b10');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 90
        }
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 91,
      "metadata": {
        "id": "CmdxWxvqbdke",
        "outputId": "fc05e249-cc43-4d9c-f594-3ce9a7d73752",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "alice      1985\n",
              "bob        1984\n",
              "charles    1992\n",
              "Name: birthyear, dtype: int64"
            ]
          },
          "metadata": {},
          "execution_count": 91
        }
      ],
      "source": [
        "birthyears"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1_VwIo7cbdke"
      },
      "source": [
        "When you add a new column, it must have the same number of rows. Missing rows are filled with NaN, and extra rows are ignored:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 92,
      "metadata": {
        "id": "4xPE-3XQbdke",
        "outputId": "c06cbfba-1c16-43f2-d6d1-d05ec66827e2",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight    hobby  age  over 30  pets\n",
              "alice        68   Biking   33     True   NaN\n",
              "bob          83  Dancing   34     True   0.0\n",
              "charles     112      NaN   26    False   5.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-452e7378-dae7-43cd-a6e8-d8dc14c573b9\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>hobby</th>\n",
              "      <th>age</th>\n",
              "      <th>over 30</th>\n",
              "      <th>pets</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>Biking</td>\n",
              "      <td>33</td>\n",
              "      <td>True</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>34</td>\n",
              "      <td>True</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>NaN</td>\n",
              "      <td>26</td>\n",
              "      <td>False</td>\n",
              "      <td>5.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-452e7378-dae7-43cd-a6e8-d8dc14c573b9')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-452e7378-dae7-43cd-a6e8-d8dc14c573b9 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-452e7378-dae7-43cd-a6e8-d8dc14c573b9');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 92
        }
      ],
      "source": [
        "people[\"pets\"] = pd.Series({\"bob\": 0, \"charles\": 5, \"eugene\":1})  # alice is missing, eugene is ignored\n",
        "people"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qmOZF02Ebdkf"
      },
      "source": [
        "When adding a new column, it is added at the end (on the right) by default. You can also insert a column anywhere else using the `insert()` method:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 93,
      "metadata": {
        "id": "GEIn8gb1bdkf",
        "outputId": "dc73568e-b163-4831-ff8e-dec3cdf9eec3",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  height    hobby  age  over 30  pets\n",
              "alice        68     172   Biking   33     True   NaN\n",
              "bob          83     181  Dancing   34     True   0.0\n",
              "charles     112     185      NaN   26    False   5.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-c7d21891-1956-4cae-bf61-336de821f91f\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>height</th>\n",
              "      <th>hobby</th>\n",
              "      <th>age</th>\n",
              "      <th>over 30</th>\n",
              "      <th>pets</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>172</td>\n",
              "      <td>Biking</td>\n",
              "      <td>33</td>\n",
              "      <td>True</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>181</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>34</td>\n",
              "      <td>True</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>185</td>\n",
              "      <td>NaN</td>\n",
              "      <td>26</td>\n",
              "      <td>False</td>\n",
              "      <td>5.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c7d21891-1956-4cae-bf61-336de821f91f')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-c7d21891-1956-4cae-bf61-336de821f91f button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-c7d21891-1956-4cae-bf61-336de821f91f');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 93
        }
      ],
      "source": [
        "people.insert(1, \"height\", [172, 181, 185])\n",
        "people"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YAQ4_fffbdkf"
      },
      "source": [
        "You can also create new columns by calling the `assign()` method. Note that this returns a new `DataFrame` object, the original is not modified"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 103,
      "metadata": {
        "id": "qEDE3MLTbdkf",
        "outputId": "46f069d4-6c3d-4906-e7ef-b164b635d223",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  height    hobby  age  over 30  pets        bmi  has_pets\n",
              "alice        68     172   Biking   33     True   NaN  22.985398     False\n",
              "bob          83     181  Dancing   34     True   0.0  25.335002     False\n",
              "charles     112     185      NaN   26    False   5.0  32.724617      True"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-921d15af-18a7-406d-8899-f70537afcb51\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>height</th>\n",
              "      <th>hobby</th>\n",
              "      <th>age</th>\n",
              "      <th>over 30</th>\n",
              "      <th>pets</th>\n",
              "      <th>bmi</th>\n",
              "      <th>has_pets</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>172</td>\n",
              "      <td>Biking</td>\n",
              "      <td>33</td>\n",
              "      <td>True</td>\n",
              "      <td>NaN</td>\n",
              "      <td>22.985398</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>181</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>34</td>\n",
              "      <td>True</td>\n",
              "      <td>0.0</td>\n",
              "      <td>25.335002</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>185</td>\n",
              "      <td>NaN</td>\n",
              "      <td>26</td>\n",
              "      <td>False</td>\n",
              "      <td>5.0</td>\n",
              "      <td>32.724617</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-921d15af-18a7-406d-8899-f70537afcb51')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-921d15af-18a7-406d-8899-f70537afcb51 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-921d15af-18a7-406d-8899-f70537afcb51');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 103
        }
      ],
      "source": [
        "p2 = people.assign(\n",
        "    bmi = people[\"weight\"] / (people[\"height\"] / 100) ** 2,\n",
        "    has_pets = people[\"pets\"] > 0\n",
        ")\n",
        "p2"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "You can also rename the column name:"
      ],
      "metadata": {
        "id": "li0DlR9hvHde"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "p2.rename(columns={'bmi':'body_mass_index'})"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        },
        "id": "WuV0yEpEuJD9",
        "outputId": "a0246610-cc9b-4441-e5ea-392b49b9035a"
      },
      "execution_count": 104,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  height    hobby  age  over 30  pets  body_mass_index  \\\n",
              "alice        68     172   Biking   33     True   NaN        22.985398   \n",
              "bob          83     181  Dancing   34     True   0.0        25.335002   \n",
              "charles     112     185      NaN   26    False   5.0        32.724617   \n",
              "\n",
              "         has_pets  \n",
              "alice       False  \n",
              "bob         False  \n",
              "charles      True  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-702fd202-9875-411b-9783-8d808830a0a4\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>height</th>\n",
              "      <th>hobby</th>\n",
              "      <th>age</th>\n",
              "      <th>over 30</th>\n",
              "      <th>pets</th>\n",
              "      <th>body_mass_index</th>\n",
              "      <th>has_pets</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>172</td>\n",
              "      <td>Biking</td>\n",
              "      <td>33</td>\n",
              "      <td>True</td>\n",
              "      <td>NaN</td>\n",
              "      <td>22.985398</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>181</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>34</td>\n",
              "      <td>True</td>\n",
              "      <td>0.0</td>\n",
              "      <td>25.335002</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>185</td>\n",
              "      <td>NaN</td>\n",
              "      <td>26</td>\n",
              "      <td>False</td>\n",
              "      <td>5.0</td>\n",
              "      <td>32.724617</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-702fd202-9875-411b-9783-8d808830a0a4')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-702fd202-9875-411b-9783-8d808830a0a4 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-702fd202-9875-411b-9783-8d808830a0a4');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 104
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "I2Oe3W7zbdkg"
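      },
      "source": [
        "#### Evaluating an expression\n",
        "pandas can evaluate a string expression against the columns of a `DataFrame` with the `eval()` method. Recent pandas versions use the `numexpr` library when it is installed and otherwise fall back to a slower pure-Python engine, so installing `numexpr` is recommended for large frames."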
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 105,
      "metadata": {
        "id": "_bo42tAMbdkg",
        "outputId": "888705bc-1db3-4c48-ab9e-4ebbc7c8de81",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "alice      False\n",
              "bob         True\n",
              "charles     True\n",
              "dtype: bool"
            ]
          },
          "metadata": {},
          "execution_count": 105
        }
      ],
      "source": [
        "people.eval(\"weight / (height/100) ** 2 > 25\")"
      ]
    },
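    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A quick way to check what `eval()` computes is to compare it with the same test written as ordinary column arithmetic. The sketch below is only illustrative: `people_demo` is a hypothetical stand-in that copies the `weight` and `height` values shown above, so it does not touch the notebook's `people` object.\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical stand-in for `people`, using the weight/height values shown above.\n",
        "people_demo = pd.DataFrame(\n",
        "    {'weight': [68, 83, 112], 'height': [172, 181, 185]},\n",
        "    index=['alice', 'bob', 'charles'],\n",
        ")\n",
        "\n",
        "# String expression evaluated by pandas.\n",
        "via_eval = people_demo.eval('weight / (height/100) ** 2 > 25')\n",
        "\n",
        "# The same test written as ordinary column arithmetic.\n",
        "bmi = people_demo['weight'] / (people_demo['height'] / 100) ** 2\n",
        "via_pandas = bmi > 25\n",
        "\n",
        "print(via_eval.equals(via_pandas))\n",
        "```\n",
        "\n",
        "Both forms should give the same boolean `Series`. The string form mainly pays off on large frames, where `numexpr` can avoid some intermediate arrays."
      ]
    },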
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BHvHS0IQbdkh"
      },
      "source": [
        "Assignment expressions are also supported. Let's set `inplace=True` to directly modify the `DataFrame` rather than getting a modified copy:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 106,
      "metadata": {
        "id": "6Jo-4YJNbdkh",
        "outputId": "98a48e15-2675-43af-da81-3a7a40399317",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  height    hobby  age  over 30  pets  body_mass_index\n",
              "alice        68     172   Biking   33     True   NaN        22.985398\n",
              "bob          83     181  Dancing   34     True   0.0        25.335002\n",
              "charles     112     185      NaN   26    False   5.0        32.724617"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-2783160a-48b1-476e-8a91-749889714be1\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>height</th>\n",
              "      <th>hobby</th>\n",
              "      <th>age</th>\n",
              "      <th>over 30</th>\n",
              "      <th>pets</th>\n",
              "      <th>body_mass_index</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>172</td>\n",
              "      <td>Biking</td>\n",
              "      <td>33</td>\n",
              "      <td>True</td>\n",
              "      <td>NaN</td>\n",
              "      <td>22.985398</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>181</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>34</td>\n",
              "      <td>True</td>\n",
              "      <td>0.0</td>\n",
              "      <td>25.335002</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>185</td>\n",
              "      <td>NaN</td>\n",
              "      <td>26</td>\n",
              "      <td>False</td>\n",
              "      <td>5.0</td>\n",
              "      <td>32.724617</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-2783160a-48b1-476e-8a91-749889714be1')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-2783160a-48b1-476e-8a91-749889714be1 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-2783160a-48b1-476e-8a91-749889714be1');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 106
        }
      ],
      "source": [
        "people.eval(\"body_mass_index = weight / (height/100) ** 2\", inplace=True)\n",
        "people"
      ]
    },
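    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "For comparison, here is a small sketch of the default behaviour, assuming `inplace` is left at `False`: `eval()` then returns a new `DataFrame` carrying the extra column and leaves the original untouched. `people_demo` and `with_bmi` are hypothetical names used only for this illustration.\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical stand-in for `people`, using the weight/height values shown above.\n",
        "people_demo = pd.DataFrame(\n",
        "    {'weight': [68, 83, 112], 'height': [172, 181, 185]},\n",
        "    index=['alice', 'bob', 'charles'],\n",
        ")\n",
        "\n",
        "# Without inplace=True, the assignment lands in a new DataFrame.\n",
        "with_bmi = people_demo.eval('body_mass_index = weight / (height/100) ** 2')\n",
        "\n",
        "print('body_mass_index' in with_bmi.columns)     # column exists in the result\n",
        "print('body_mass_index' in people_demo.columns)  # original is left as it was\n",
        "```\n",
        "\n",
        "Returning a new object makes it easier to chain calls, at the cost of an extra copy."
      ]
    },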
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0EhB5ch3bdkh"
      },
      "source": [
        "You can refer to a local or global Python variable inside an expression by prefixing its name with `@`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 107,
      "metadata": {
        "id": "Df6YIkMRbdkh",
        "outputId": "8e2ff479-7929-43fc-f21d-4140bb5608da",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  height    hobby  age  over 30  pets  body_mass_index  \\\n",
              "alice        68     172   Biking   33     True   NaN        22.985398   \n",
              "bob          83     181  Dancing   34     True   0.0        25.335002   \n",
              "charles     112     185      NaN   26    False   5.0        32.724617   \n",
              "\n",
              "         overweight  \n",
              "alice         False  \n",
              "bob           False  \n",
              "charles        True  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-52ae7301-96c8-489a-8985-1ec6c56ca68a\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>height</th>\n",
              "      <th>hobby</th>\n",
              "      <th>age</th>\n",
              "      <th>over 30</th>\n",
              "      <th>pets</th>\n",
              "      <th>body_mass_index</th>\n",
              "      <th>overweight</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>172</td>\n",
              "      <td>Biking</td>\n",
              "      <td>33</td>\n",
              "      <td>True</td>\n",
              "      <td>NaN</td>\n",
              "      <td>22.985398</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>181</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>34</td>\n",
              "      <td>True</td>\n",
              "      <td>0.0</td>\n",
              "      <td>25.335002</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>185</td>\n",
              "      <td>NaN</td>\n",
              "      <td>26</td>\n",
              "      <td>False</td>\n",
              "      <td>5.0</td>\n",
              "      <td>32.724617</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-52ae7301-96c8-489a-8985-1ec6c56ca68a')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-52ae7301-96c8-489a-8985-1ec6c56ca68a button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-52ae7301-96c8-489a-8985-1ec6c56ca68a');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 107
        }
      ],
      "source": [
        "overweight_threshold = 30\n",
        "people.eval(\"overweight = body_mass_index > @overweight_threshold\", inplace=True)\n",
        "people"
      ]
    },
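    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `@` prefix also reaches variables that are local to a function, which can be handy when the threshold is a parameter. In this sketch `flag_above`, `threshold` and `people_demo` are hypothetical names, and the BMI values are copied from the table above.\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical stand-in holding only the BMI column shown above.\n",
        "people_demo = pd.DataFrame(\n",
        "    {'body_mass_index': [22.985398, 25.335002, 32.724617]},\n",
        "    index=['alice', 'bob', 'charles'],\n",
        ")\n",
        "\n",
        "def flag_above(df, threshold):\n",
        "    # `threshold` is local to this function; @threshold pulls it into the expression.\n",
        "    return df.eval('body_mass_index > @threshold')\n",
        "\n",
        "flags = flag_above(people_demo, 30)\n",
        "print(flags.tolist())\n",
        "```"
      ]
    },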
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8_UK9SF5bdkh"
      },
      "source": [
        "#### Querying a `DataFrame`\n",
        "The `query()` method lets you **filter a `DataFrame` based on a query expression**:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 108,
      "metadata": {
        "id": "hiVf_7cJbdkh",
        "outputId": "59659ea4-0a63-470a-e5db-4081ccfbbd01",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 81
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "     weight  height    hobby  age  over 30  pets  body_mass_index  overweight\n",
              "bob      83     181  Dancing   34     True   0.0        25.335002       False"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-c6ed813d-f054-4d1a-ae9a-e899796880aa\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>height</th>\n",
              "      <th>hobby</th>\n",
              "      <th>age</th>\n",
              "      <th>over 30</th>\n",
              "      <th>pets</th>\n",
              "      <th>body_mass_index</th>\n",
              "      <th>overweight</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>181</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>34</td>\n",
              "      <td>True</td>\n",
              "      <td>0.0</td>\n",
              "      <td>25.335002</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c6ed813d-f054-4d1a-ae9a-e899796880aa')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-c6ed813d-f054-4d1a-ae9a-e899796880aa button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-c6ed813d-f054-4d1a-ae9a-e899796880aa');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 108
        }
      ],
      "source": [
        "people.query(\"age > 30 and pets == 0\")"
      ]
    },
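    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a rough illustration, the query above should select the same rows as an explicit boolean mask. The sketch also uses backtick quoting, which pandas 0.25 and later should accept for column names that contain spaces, such as `over 30`. `people_demo` is a hypothetical stand-in built from the values shown earlier.\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical stand-in for `people`, using values from the tables above.\n",
        "people_demo = pd.DataFrame(\n",
        "    {\n",
        "        'age': [33, 34, 26],\n",
        "        'over 30': [True, True, False],\n",
        "        'pets': [float('nan'), 0.0, 5.0],\n",
        "    },\n",
        "    index=['alice', 'bob', 'charles'],\n",
        ")\n",
        "\n",
        "# Filtering with a query expression ...\n",
        "via_query = people_demo.query('age > 30 and pets == 0')\n",
        "\n",
        "# ... and with the equivalent boolean mask.\n",
        "mask = (people_demo['age'] > 30) & (people_demo['pets'] == 0)\n",
        "via_mask = people_demo[mask]\n",
        "\n",
        "# Backticks let the expression name a column that contains a space.\n",
        "via_backticks = people_demo.query('`over 30` and pets == 0')\n",
        "\n",
        "print(via_query.index.tolist())\n",
        "print(via_mask.index.tolist())\n",
        "print(via_backticks.index.tolist())\n",
        "```\n",
        "\n",
        "Note that alice is excluded in every case because her `pets` value is `NaN`, and `NaN == 0` evaluates to `False`."
      ]
    },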
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zQYCm9izbdkh"
      },
      "source": [
        "#### Sorting a `DataFrame`\n",
        "You can sort a `DataFrame` by calling its `sort_index` method. By default it sorts the rows by their index label, in ascending order, but let's reverse the order:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 109,
      "metadata": {
        "id": "vTUlVF6ibdki",
        "outputId": "1e746bfe-56f2-400f-e72b-4c11fcfbcf63",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         weight  height    hobby  age  over 30  pets  body_mass_index  \\\n",
              "charles     112     185      NaN   26    False   5.0        32.724617   \n",
              "bob          83     181  Dancing   34     True   0.0        25.335002   \n",
              "alice        68     172   Biking   33     True   NaN        22.985398   \n",
              "\n",
              "         overweight  \n",
              "charles        True  \n",
              "bob           False  \n",
              "alice         False  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-e88db3c9-fd42-4687-8402-bab5670f3973\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>weight</th>\n",
              "      <th>height</th>\n",
              "      <th>hobby</th>\n",
              "      <th>age</th>\n",
              "      <th>over 30</th>\n",
              "      <th>pets</th>\n",
              "      <th>body_mass_index</th>\n",
              "      <th>overweight</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>112</td>\n",
              "      <td>185</td>\n",
              "      <td>NaN</td>\n",
              "      <td>26</td>\n",
              "      <td>False</td>\n",
              "      <td>5.0</td>\n",
              "      <td>32.724617</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>83</td>\n",
              "      <td>181</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>34</td>\n",
              "      <td>True</td>\n",
              "      <td>0.0</td>\n",
              "      <td>25.335002</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>68</td>\n",
              "      <td>172</td>\n",
              "      <td>Biking</td>\n",
              "      <td>33</td>\n",
              "      <td>True</td>\n",
              "      <td>NaN</td>\n",
              "      <td>22.985398</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e88db3c9-fd42-4687-8402-bab5670f3973')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-e88db3c9-fd42-4687-8402-bab5670f3973 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-e88db3c9-fd42-4687-8402-bab5670f3973');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 109
        }
      ],
      "source": [
        "people.sort_index(ascending=False)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hyUelvGXbdki"
      },
      "source": [
        "Note that `sort_index` returned a sorted *copy* of the `DataFrame`. To modify `people` directly, we can set the `inplace` argument to `True`. Also, we can sort the columns instead of the rows by setting `axis=1`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 110,
      "metadata": {
        "id": "GTIgiGLTbdki",
        "outputId": "fea737f4-6f23-4df1-a7c0-2dac5842d163",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         age  body_mass_index  height    hobby  over 30  overweight  pets  \\\n",
              "alice     33        22.985398     172   Biking     True       False   NaN   \n",
              "bob       34        25.335002     181  Dancing     True       False   0.0   \n",
              "charles   26        32.724617     185      NaN    False        True   5.0   \n",
              "\n",
              "         weight  \n",
              "alice        68  \n",
              "bob          83  \n",
              "charles     112  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-fb3b3bbc-f3ab-4bbc-8586-79d20dd6624c\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>age</th>\n",
              "      <th>body_mass_index</th>\n",
              "      <th>height</th>\n",
              "      <th>hobby</th>\n",
              "      <th>over 30</th>\n",
              "      <th>overweight</th>\n",
              "      <th>pets</th>\n",
              "      <th>weight</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>33</td>\n",
              "      <td>22.985398</td>\n",
              "      <td>172</td>\n",
              "      <td>Biking</td>\n",
              "      <td>True</td>\n",
              "      <td>False</td>\n",
              "      <td>NaN</td>\n",
              "      <td>68</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>34</td>\n",
              "      <td>25.335002</td>\n",
              "      <td>181</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>True</td>\n",
              "      <td>False</td>\n",
              "      <td>0.0</td>\n",
              "      <td>83</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>26</td>\n",
              "      <td>32.724617</td>\n",
              "      <td>185</td>\n",
              "      <td>NaN</td>\n",
              "      <td>False</td>\n",
              "      <td>True</td>\n",
              "      <td>5.0</td>\n",
              "      <td>112</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-fb3b3bbc-f3ab-4bbc-8586-79d20dd6624c')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-fb3b3bbc-f3ab-4bbc-8586-79d20dd6624c button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-fb3b3bbc-f3ab-4bbc-8586-79d20dd6624c');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 110
        }
      ],
      "source": [
        "people.sort_index(axis=1, inplace=True)\n",
        "people"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2-4p9J5Dbdki"
      },
      "source": [
        "To sort the `DataFrame` by the values instead of the labels, we can use `sort_values` and specify the column to sort by:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 111,
      "metadata": {
        "id": "JHukwfIIbdki",
        "outputId": "57a23819-651b-49b6-b646-a6580bc37627",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         age  body_mass_index  height    hobby  over 30  overweight  pets  \\\n",
              "charles   26        32.724617     185      NaN    False        True   5.0   \n",
              "alice     33        22.985398     172   Biking     True       False   NaN   \n",
              "bob       34        25.335002     181  Dancing     True       False   0.0   \n",
              "\n",
              "         weight  \n",
              "charles     112  \n",
              "alice        68  \n",
              "bob          83  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-306e3fbd-d6a2-4628-9a73-20b21c276b12\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>age</th>\n",
              "      <th>body_mass_index</th>\n",
              "      <th>height</th>\n",
              "      <th>hobby</th>\n",
              "      <th>over 30</th>\n",
              "      <th>overweight</th>\n",
              "      <th>pets</th>\n",
              "      <th>weight</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>26</td>\n",
              "      <td>32.724617</td>\n",
              "      <td>185</td>\n",
              "      <td>NaN</td>\n",
              "      <td>False</td>\n",
              "      <td>True</td>\n",
              "      <td>5.0</td>\n",
              "      <td>112</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>33</td>\n",
              "      <td>22.985398</td>\n",
              "      <td>172</td>\n",
              "      <td>Biking</td>\n",
              "      <td>True</td>\n",
              "      <td>False</td>\n",
              "      <td>NaN</td>\n",
              "      <td>68</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>34</td>\n",
              "      <td>25.335002</td>\n",
              "      <td>181</td>\n",
              "      <td>Dancing</td>\n",
              "      <td>True</td>\n",
              "      <td>False</td>\n",
              "      <td>0.0</td>\n",
              "      <td>83</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-306e3fbd-d6a2-4628-9a73-20b21c276b12')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-306e3fbd-d6a2-4628-9a73-20b21c276b12 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-306e3fbd-d6a2-4628-9a73-20b21c276b12');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 111
        }
      ],
      "source": [
        "people.sort_values(by=\"age\", inplace=True)\n",
        "people"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "OBVCkgSGbdki"
      },
      "source": [
        "#### Plotting a `DataFrame`\n",
        "Just like for `Series`, pandas makes it easy to draw nice graphs based on a `DataFrame`.\n",
        "\n",
        "For example, it is trivial to create a line plot from a `DataFrame`'s data by calling its `plot` method:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 112,
      "metadata": {
        "id": "7gv0BV_ebdki",
        "outputId": "15c3bd59-5925-4811-f428-6320e47430b2",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 280
        }
      },
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ],
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAEHCAYAAABV4gY/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3deXzV1Z3/8dcHshGysQSBLASVRZaAEJFFXKtVx4p23DsWV6bK1FGn07F2HtVpbYtK7cMZH1N/WpA6Y3VUbHW0zrSOWhUqGJQdF1SWsG8JS8h+fn+cb3IvMSF7bvLN+/l45MG95/u9954vyjvnnu9ZzDmHiIiES69YV0BERNqfwl1EJIQU7iIiIaRwFxEJIYW7iEgIxcW6AgADBw50eXl5sa6GiEi3smLFir3OucyGjnWJcM/Ly6OwsDDW1RAR6VbMbHNjx9QtIyISQgp3EZEQUriLiISQwl1EJIQU7iIiIaRwFxEJIYW7iEgIdYlx7iIiYVdWWc2B0gqKSys5UFpBSWklB0orGTU4lcnD+rX75yncRURaoLyqui6Yi0srOFBaScnRiuB58PhIJcVHfZDXhnl5VU2D7zfnzBMV7iIi7aWyuqYujIujwrq41AfzgdLKIMRrQ7qC4qOVlFZUN/qe8b2NjOQE+iXHk9Engdz+yeRnx5ORnEBGUNYvOZ705Hj6BWX9khM65PoU7iLSrVXXOEqO1mtFH6mk+GgkrA+UVlByNDqoKzlcXtXoe/buZWT0ifeBnJzA0IwkThmS5kM7ORLW/ZITSO8TT7++CWT0iSc5oTdm1olX3ziFu4h0CTU1jkNlVT6Aj0b3Sx/bcvYt6oq6lvbBssZDupdBep9IGGemJDJyUOoxLeeMZB/Mtc/Tk+NJTYzrMiHdWgp3EWlXzjkOlVcdE8y1LedjbyZWBK1rH9IlRyupOc6WzmlJcXUt5IzkBPIG9q1rOde1omtDOwjr1KQ4evXq3iHdWgp3EWmQc47SimrfWj5Sv1sj0k8duZlY219dSfVxUjolMS5oMfsAzsro89VWdN940oP+6YwgwHv30JBurSbD3cwWApcAu51z44KyicDjQBJQBdzunFtu/nvMo8DFQClwg3Puw46qvDSusrqGT3YeYnVRCauLipmYk8E1U3JjXS2JkcaG4UVGdNS7gXjUP66obniEB0ByQu+6VnRGcjyjB6cFLWd/47A2rKP7qdP7xBPfW9NrOkNzWu6LgMeAp6PKHgL+xTn3upldHDw/G7gIGBH8nA78KvhTOlBNjeOLvUdYXVTM6qISVhUVs377wbqhV2lJcQxKTYxxLaU9tPcwPIDEuF5RLed4TspMqWs5ZwRhHd2Krh3tkRjXuxOvXFqqyXB3zr1jZnn1i4G04HE6sD14PAt42jnngPfNLMPMhjjndrRTfXs85xzbio/WhfjqrSWs3VbCoeDOf5/43ozLSuNvpg4jPzudCdkZDBuQ3O1vDoVNRw7Dq+1vzmnmMLykeIV0GLW2z/1O4H/NbD5+CYPpQXkWsDXqvKKg7CvhbmZzgDkAubnqLmjM3sPlrC4qZtVW372yZlsJew9XAP4f8+jBaVw6cSgTsjPIz0nn5MwU4vS1t9NoGJ50Va0N99uAu5xzi83sKmAB8LWWvIFz7gngCYCCgoLj3CPvOQ6VVbJmW0ldP/mqrSVsKz4KgBmcnJnCWSMHMSEnnfzsDEYPTlWrq500dxjeMUPyjrR9GF56n2Nb0WEZhiex19pwnw38ffD4BeDXweNtQE7UedlBmdRTVlnN+h0HWb010k/+xd4juODXXE7/PkzMzWD29GHkZ2cwLiudlEQNbmpKQ8PwvtKKbuUwvEh/sx+GF30zUcPwpKtpbVpsB84C3gbOBT4Lyl8B/s7MnsPfSC1RfztUVdfw6a7DvjUetMo/2XmIqiBNBqYkMiE7nUsnZNW1yvv37Zgpyd2Fhu
GJtE1zhkI+ix8JM9DMioD7gFuBR80sDigj6DsH/oAfBrkRPxTyxg6oc5dWU+PYtO9I5IZnUQnrtpdQVhkZuZKfncGtZ57IhGwf5EPSk0L9Nbwlw/Cibya2ZRjesTcNIyM/NAxPeormjJa5tpFDkxs41wFz21qp7sI5x46SsmNa5KuLSjgU9MMmxfdi7NB0rp2Sy4TsDCbkZDCsf3K3/are2DC8Y1rRbRiGl94nnhMHptS7afjVYXhpfeJ1r0GkCerEbYH9Ryrqhh/WBvrew+UAxPUyRg1O5ZL8oXUt8pEndM2RK5XVNXUjPDQMTyScFO6NOFxexZqo1viqomKKDkRGrpw4sC9njhhIfnY6+TkZjBmS1ulBFctheJEZiD6kNQxPpGtRuOP7hDfsOHhMP/nnew7XjVzJyujDhJz0uolB47PSSU2Kb7fPb2oY3lduJjZjGJ4ZdX3S6X38MLwRg1Ijrei+GoYnEmY9Ltyrqmv4bPdXR65UVteOXEkgPzuDS/KH+IlB2ekMSGne1P3oYXi1rWYNwxORWAh1uDvn2LSv9JgZnuu2H+Rope87Tk2MY3x2OjefEYxcyclgaHoSQN0wvB0lZXy881DMhuGlJcV1yX57EenaQhXuO0vKWLm1uK6ffHVR8Ve6LganJTE+O53c/sn0S46n5Ggln+85zIebD7R6GN6owanH3EzUMDwRibVuHe7lVdU8tWQThZsOsLqomN2Hypt8zc6DZew8WMbyL/eTENeLflFrdGgYnoiERbcO9893H+GRP36Kw5GRnMCIQSlRLeeGh+HV3kzM6JNAnwSFtIiEU7cO9zFD01j3468T18s0wkNEJEq3DndA/dgiIg1QMoqIhJDCXUQkhBTuIiIhpHAXEQkhhbuISAgp3EVEQkjhLiISQgp3EZEQUriLiISQwl1EJIQU7iIiIaRwFxEJIYW7iEgIKdxFREKoyXA3s4VmttvM1tYr/66ZfWxm68zsoajyH5jZRjP7xMy+3hGVFhGR42vOeu6LgMeAp2sLzOwcYBYwwTlXbmaDgvIxwDXAWGAo8IaZjXTOVbd3xUVEpHFNttydc+8A++sV3wbMc86VB+fsDspnAc8558qdc18CG4Ep7VhfERFphtb2uY8EZprZMjP7s5mdFpRnAVujzisKykREpBO1dpu9OKA/MBU4DXjezE5syRuY2RxgDkBubm4rqyEiIg1pbcu9CHjJecuBGmAgsA3IiTovOyj7CufcE865AudcQWZmZiurISIiDWltuP8eOAfAzEYCCcBe4BXgGjNLNLPhwAhgeXtUVEREmq/JbhkzexY4GxhoZkXAfcBCYGEwPLICmO2cc8A6M3seWA9UAXM1UkZEpPOZz+TYKigocIWFhbGuhohIt2JmK5xzBQ0d0wxVEZEQUriLiISQwl1EJIQU7iIiIaRwFxEJIYW7iEgIKdxFREJI4S4iEkIKdxGREFK4i4iEkMJdRCSEFO4iIiGkcBcRCSGFu4hICCncRURCSOEuIhJCCncRkRBSuIuIhJDCXUQkhBTuIiIhpHAXEQkhhbuISAgp3EVEQkjhLiISQgp3EZEQajLczWyhme02s7UNHPsHM3NmNjB4bmb2r2a20cxWm9mkjqi0iIgcX3Na7ouAC+sXmlkOcAGwJar4ImBE8DMH+FXbqygiIi3VZLg7594B9jdw6JfA9wEXVTYLeNp57wMZZjakXWoqIiLN1qo+dzObBWxzzq2qdygL2Br1vCgoa+g95phZoZkV7tmzpzXVEBGRRrQ43M0sGbgX+FFbPtg594RzrsA5V5CZmdmWtxIRkXriWvGak4DhwCozA8gGPjSzKcA2ICfq3OygTEREOlGLW+7OuTXOuUHOuTznXB6+62WSc24n8Arw7WDUzFSgxDm3o32rLCIiTWnOUMhngb8Ao8ysyMxuPs7pfwC+ADYCTwK3t0stRUSkRZrslnHOXdvE8byoxw6Y2/ZqiYhIW2iGqohICCncRURCSOEuIh
JCCncRkRBSuIuIhJDCXUQkhBTuIiIhpHAXEQkhhbuISAgp3EVEQkjhLiISQgp3EZEQUriLiISQwl1EJIQU7iIiIaRwFxEJIYW7iEgIKdxFREJI4S4iEkIKdxGREFK4i4iEkMJdRCSEFO4iIiGkcBcRCSGFu4hICDUZ7ma20Mx2m9naqLKHzexjM1ttZr8zs4yoYz8ws41m9omZfb2jKi4iIo1rTst9EXBhvbI/AeOcc/nAp8APAMxsDHANMDZ4zb+bWe92q62IiDRLk+HunHsH2F+v7I/Ouarg6ftAdvB4FvCcc67cOfclsBGY0o71FRGRZmiPPvebgNeDx1nA1qhjRUHZV5jZHDMrNLPCPXv2tEM1RESkVpvC3cx+CFQBz7T0tc65J5xzBc65gszMzLZUQ0RE6olr7QvN7AbgEuA855wLircBOVGnZQdlIiLSiVrVcjezC4HvA5c650qjDr0CXGNmiWY2HBgBLG97NUVEpCWabLmb2bPA2cBAMysC7sOPjkkE/mRmAO87577jnFtnZs8D6/HdNXOdc9UdVXkREWmYRXpUYqegoMAVFhbGuhoiIt2Kma1wzhU0dEwzVEVEQkjhLiISQgp3EZEQUriLiISQwl1EJIQU7iIiIaRwFxGJlfJDULq/6fNaodXLD4iISAsdPQBb3odN78HmpbBjFcy8G87953b/KIW7iEhHObwHtiyFTUt8mO9aCzjonQDZp/lgH3VRh3y0wl1EpL0c3O5DvLZlvvcTXx7XB3KmwDn3wrDpkFUA8UkdWhWFu4hIazgHxZuDMF8Cm5fAgS/9sYRUyJ0KE6+FYTNgyESIS+jU6incRUSawznY9zlsfi8S6AeL/LE+/SB3Oky51bfMB+dDr9juMKpwFxFpSE0N7PnYt8g3B33mh3f5Y30H+RDPu9P/mXkK9Opagw8V7iIiADXVsHNNJMg3L4WjwTDFtCwYfhbkzfDdLANOBr/ceZelcBeRnqm6EravjHSzbHkfyg/6Y/3yYNTFQet8BmQM6/JhXp/CXUR6hsoy2LYiaJW/B1uXQ2WwkdzAkTDur32rfNh0SM+KbV3bgcJdRMKp4ogP8M1LfVdLUSFUlwMGJ4yFU6/3QT5sBqRkxrq27U7hLiLhUFYCW5ZFboBu/whqqsB6wZAJwUiWGX6IYnL/WNe2wyncRaR7Kt0fufG5+T1/M9TVQK94yJoE0+/wYZ4zBZLSYl3bTqdwF5Hu4dCuqJEsS2D3el8el+Sn8p/5jz7Ms0+DhOTY1rULULiLSNdUUnTsVP59n/ny+L6QezqM+yYMO8O30uMSY1vXLkjhLiKx55yfuh89lb94sz+WmA7DpsGk632YD8mH3vGxrW83oHAXkc7nHOz91Id47YqJh7b7Y8kD/CiWqbf5bpYTxsZ8Kn93pHAXkY5XU+P7yDcviXSzlO71x1JO8CFeO/tz4KguN5W/O1K4i0j7q66CnauPncpfVuyPpefCyV+LhHn/E7vd7M/uoMlwN7OFwCXAbufcuKCsP/BfQB6wCbjKOXfAzAx4FLgYKAVucM592DFVF5Euo6rCjyuvm8q/DCoO+WP9T4JTvgF5Z/julozc2Na1h2hOy30R8BjwdFTZPcD/Oefmmdk9wfN/Ai4CRgQ/pwO/Cv4UkTCpPOpnfNZOGNr6AVQd9ccyT4H8q3zLPHc6pA2JbV17qCbD3Tn3jpnl1SueBZwdPP4N8DY+3GcBTzvnHPC+mWWY2RDn3I72qrCIxED5Ydi6LNLNsm0FVFcABoPHw+QbgjCfBn0Hxrq2Quv73E+ICuydwAnB4yxga9R5RUHZV8LdzOYAcwByc/U1TaRLOVrsV0ms7WbZvhJcNVhvGDoRTv+O72bJOR36ZMS6ttKANt9Qdc45M3OteN0TwBMABQUFLX69iLSjI/uibn6+BzujNnLOmgxn3OX7y3NOh8SUWNdWmqG14b6rtrvFzIYAu4PybUBO1HnZQZmIdCWHdkaGJG5e4nccgmAj59
Pg7B/4MM8ugPg+sa2rtEprw/0VYDYwL/jz5ajyvzOz5/A3UkvU3y7SBRRvicz83LwE9n/hyxNS/VT+/Kv9sMShp3b6Rs7SMZozFPJZ/M3TgWZWBNyHD/XnzexmYDNwVXD6H/DDIDfih0Le2AF1FpHjcc6Hd3TLvCS4FZaU4VvkBTdHNnLurekuYdSc0TLXNnLovAbOdcDctlZKRFrAOd+tUhfmS+HwTn+sb6YP8el3+D8HjdHszx5Cv7JFupuaati1NrJi4pa/QOk+fyx1KAyfGewwdAYMHKHZnz2Uwl2kq6uuhB2rIotsbXkfykv8sYxhMPLCyHZx/fIU5gIo3EW6nqryYCPnIMy3LofKI/7YgBEw7vKojZyzY1tX6bIU7iKxVlEKRR9Ewrzog2AjZ2DQWJh4XWSRrZRBsa2rdBsKd5HOVnbQt8ZrZ39u+xBqKv1GzoPz4bRbIlP5e8BGztIxFO4iHa10fzCVP1jLfOfqYCPnOBg6CabNDabyT4Gk9FjXVkJC4S7S3g7vjowv37wUdq3DT+VP9Js3z/yeb5lnnwYJfWNdWwkphbtIW5Vsi6zJsnmp3z4OID7Zr8Vyzg99mA+dBPFJsa2r9BgKd5GWcM5v3Bw9lf/AJn8sMQ1yp8LEb/luliETtJGzxIzCXeR4nIN9G4+dyn8wWAuvTz8/gmXK3wZT+cdrI2fpMhTuItFqamDPhqiW+VI4Eix62ndQZEjisBmQOVpT+aXLUrhLz1ZdBbvWBGG+FLYshaMH/LG0bDjpnEiYDzhJsz+l21C4S89SVQE7Vka6Wba8H7WR84kw+q/8mizDpkO/YbGtq0gbKNwl3CrLYFthpJul6AOoLPXHBo6C/CsjU/nThsa2riLtSOEu4VJxJNjIeakP9G2FkY2cTxgHk77tgzx3OqRkxrq2Ih1G4S7dW1lJ1OzPJb7LpabKb+Q8ZAKc/re+ZZ471Y9uEekhFO7SvRzZ52961q5lvmttMJU/3m/kPOPvozZyTo11bUViRuEuXduhXZGZn5uW+GGKAHFJfvr+md+PTOXXRs4idRTu0rUUbz12Kv++jb48IcW3xsdf4Wd/Dj0V4hJjW1eRLkzhLrFTu5Fz3SJbS6B4iz+WlO5vek6a7VvmgydoI2eRFtC/Fuk8zsGeTyJBvnkpHNrhjyUP9H3lU+f6MB80RlP5e6jKykqKioooKyuLdVW6jKSkJLKzs4mPb/5aRQr3nqD8sB/b3dm7+NTURDZyru1mqdvIeUhkfHneGTBwpGZ/CgBFRUWkpqaSl5eH6f8JnHPs27ePoqIihg8f3uzXKdzDqrIMNv4J1rwIn/4vnPot+KtfdOxnVldFNnLevAS2/MUPVQTIyIURF/hAz5sB/YYrzKVBZWVlCvYoZsaAAQPYs2dPi16ncA+T6kr44s+w9kXY8KqfVp88EE79G5hwTft/XlU5bP8oMpV/6zKoOOyPDTgZxsyKTOXPyGn/z5fQUrAfqzV/H20KdzO7C7gFcMAa4EZgCPAcMABYAVzvnKtoy+fIcdTU+HHfa16E9S/D0f2QmA5jZ8G4v4a8M9vvRmRF6Ven8lcF/aKDxvhfILVdLamD2+czRaRVWv2v3syygDuAMc65o2b2PHANcDHwS+fcc2b2OHAz8Kt2qa14zvlNldcuhnUv+ZuS8ckw6iIYdwWcfF77DBMsP+Rb47UrJm5bEbWR83gouCmY/TkN+g5o++eJdBGbNm3ikksuYe3atc06//HHHyc5OZlvf/vbjZ6zaNEiCgsLeeyxx75y7Gc/+xn33ntvq+vbkLY26eKAPmZWCSQDO4BzgeuC478B7kfh3nbOwe71PtDXLva7//ROgJPPh/F/DSMvbPt+nEcP+Kn8td0sO1aBqw42cj4Vpt3uu1lyT9dGziJRvvOd77Tp9V0q3J1z28xsPrAFOAr8Ed8NU+ycqwpOKwKy2lzLnmzf55FA3/OxXzPlxLP8zMzRfwV9Mlr/3of3RE3lX+JHtuD8L4
3s02Dm3b5lnn0aJKa02yWJNNe//Pc61m8/2K7vOWZoGvd9Y2yT51VXV3PrrbeydOlSsrKyePnll9m+fTtz585lz549JCcn8+STTzJ69Gjuv/9+UlJS+N73vscHH3zAzTffTK9evTj//PN5/fXX674BbN++nQsvvJDPP/+cyy+/nIceeoh77rmHo0ePMnHiRMaOHcszzzzTLtfZlm6ZfsAsYDhQDLwAXNiC188B5gDk5ua2thrhVFIEa1/ygb5jpS/Lne5Hu5wyq/WrGR7cERnJsmkJ7P3El8f1gZwpcM69vr88q0AbOUuP99lnn/Hss8/y5JNPctVVV7F48WKeeuopHn/8cUaMGMGyZcu4/fbbefPNN4953Y033siTTz7JtGnTuOeee445tnLlSj766CMSExMZNWoU3/3ud5k3bx6PPfYYK1eubNf6t6Vb5mvAl865PQBm9hIwA8gws7ig9Z4NbGvoxc65J4AnAAoKClwb6hEOh/fA+t/7QN/yF1829FS44Kcw9nJIb+EXIOf8bM/oMD/wpT+WkBps5Hytb5kPmQhxCe17PSLtoDkt7I4yfPhwJk6cCMDkyZPZtGkTS5cu5corr6w7p7y8/JjXFBcXc+jQIaZNmwbAddddx6uvvlp3/LzzziM93Xdpjhkzhs2bN5OT0zEjydoS7luAqWaWjO+WOQ8oBN4CrsCPmJkNvNzWSobW0WL4+FU/0uXLP/vVDTNHwzn/DOO+6bd1ay7nfBdO9CJbB4v8sT79fMt/yq2+ZX7CeE3lF2lCYmJkUELv3r3ZtWsXGRkZbWph13/Pqqqq45zdNm3pc19mZi8CHwJVwEf4lvhrwHNm9kBQtqA9KhoaFUfgk9d9C33jG34jiX55cMbdfujiCWOa9z41Nb4PPnoq/+Fd/ljfzGCy0J0+zDNP0UbOIm2UlpbG8OHDeeGFF7jyyitxzrF69WomTJhQd05GRgapqaksW7aM008/neeee65Z7x0fH09lZWWLlhdoSpuab865+4D76hV/AUxpy/uGTlW5D/I1L8Kn/+OXAkgdClPm+Bb60ElNz9asqYada6IW2Vrqx7QDpGXB8LMiU/kHnKzZnyId4JlnnuG2227jgQceoLKykmuuueaYcAdYsGABt956K7169eKss86q64Y5njlz5pCfn8+kSZPa7YaqORf77u6CggJXWFgY62q0r+oq39WydrGfLVpeAn36w9jL/Fj03GnHb01XV8L2lVFT+d+H8mDUQL+8yMzPvBmQMUxhLqGxYcMGTjnllFhXo9UOHz5MSoofXTZv3jx27NjBo48+2ub3bejvxcxWOOcKGjpfHa/tqaYGtr4fmS1auhcS02D0JX4s+vCzoHcjX7sqy/wkodpFtrYuj9rIeaTvsqmd/dnSm6si0mlee+01fv7zn1NVVcWwYcNYtGhRTOqhcG8r5/z6KmsX++GLh7b7oYWjLgxmi36t4WGFFUd8gNd2sxQVQnU5fiPnsXDq9T7Ih83QRs4i3cjVV1/N1VdfHetqKNxbbfeGyOSi/V/4PTxP/hpc8BM/W7T+pJ+yg5GNnDcv8b8Qaqr8VP4hE4KRLMFGzsn9Y3NNIhIaCveW2P9FZHLR7vU+mIefCWfcBad8ww85rFW6349Xr11ka+fqqI2cJ8H07/p+85wpkJQWu2sSkVBSuDfl4HZY9zvfj779Q1+WMxUuetjfHK3dAOPwbn9e7SJbu9f58rqNnP8xMpU/ITk21yIiPYbCvSFH9gazRV/yQY3zXSfn/xjGftOvTV5SBF+8HVlka99n/rXxff3CWuMu9y3zrEnayFlEOp3CvVZZCXz8mm+hf/G2Xw1x4Eg4+wd+pEqv3r575a2fBRs5b/avS0yHYdNg0vU+zIfkNz4iRkRC7ZZbbuHuu+9mzJjGJyPecMMNXHLJJVxxxRXHlNcub3Ddddc18sqW6dnhXlHqJxWtXQyf/dHPFs3I9f3hQ/L98gBb/gK/+YYfBQOQPCDYyPk238
1ywlht5CwiAPz6179u9Ws3bdrEb3/7W4V7q1WVw+dv+hb6J69D5RHoOwhyTvebNlcdhY/+E5bs9eennBDZ93PYDBg4SlP5RTrL6/f4mdntafB4uGjecU95+OGHSUxM5I477uCuu+5i1apVvPnmm7z55pssWLCA2bNnc99991FeXs5JJ53EU089RUpKCmeffTbz58+noKCABQsW8OCDD5KRkcGECRNITEys26jjnXfe4ZFHHmHnzp089NBDXHHFFdxzzz1s2LCBiRMnMnv2bO666642XWbPCPfqKtj0brC36H9HNm0Gv0JiVbk/DpCe64c01oZ5/xM1+1Okh5k5cya/+MUvuOOOOygsLKS8vJzKykreffdd8vPzeeCBB3jjjTfo27cvDz74II888gg/+tGP6l6/fft2fvKTn/Dhhx+SmprKueeee8wyBTt27OC9997j448/5tJLL+WKK65g3rx5zJ8//5hVJNsivOFeUwNFy4PZor+HI43sHJ4yKLImy7DpvltGRLqGJlrYHWXy5MmsWLGCgwcPkpiYyKRJkygsLOTdd9/l0ksvZf369cyYMQOAioqKuiV+ay1fvpyzzjqL/v39nJUrr7ySTz/9tO74ZZddRq9evRgzZgy7du3qkGsIV7g757eGW/sirP1dZMnbaJmjI90sudMhbUjn11NEurT4+HiGDx/OokWLmD59Ovn5+bz11lts3LiR4cOHc/755/Pss8+2+v2jl/7tqPW9whHuez7xLfS1i2H/51EHzPev1YX5NOg7MGbVFJHuY+bMmcyfP5+FCxcyfvx47r77biZPnszUqVOZO3cuGzdu5OSTT+bIkSNs27aNkSNH1r32tNNO48477+TAgQOkpqayePFixo8ff9zPS01N5dChQ+1W/+4d7gd3wG+v8rM/we8vmjU5CPMz/E3StuwxKiI91syZM/npT3/KtGnT6Nu3L0lJScycOZPMzEwWLVrEtddeW7cT0wMPPHBMuGdlZXHvvfcyZcoU+vfvz+jRo5tc+jc/P5/evXszYcIEbrjhhjbfUO3eS/4e3A6v/5Mfjz5sug9zbeQs0q119yV/a9Uu/VtVVcXll1/OTTfdxOWXX97q9+tZS/6mDYWr/yPWtRAR+Yr777+fN954gzH7heoAAAgFSURBVLKyMi644AIuu+yyTv387h3uIiJd1Pz582P6+ZqNIyJdTlfoLu5KWvP3oXAXkS4lKSmJffv2KeADzjn27dtHUlIDm/4ch7plRKRLyc7OpqioiD17Gpl42AMlJSWRnZ3dotco3EWkS6mdQCRto24ZEZEQUriLiISQwl1EJIS6xAxVM9sDbI51PVppILA31pWIkZ567T31uqHnXntXve5hzrnMhg50iXDvzsyssLHpv2HXU6+9p1439Nxr747XrW4ZEZEQUriLiISQwr3tnoh1BWKop157T71u6LnX3u2uW33uIiIhpJa7iEgIKdxFREJI4d4CZpZjZm+Z2XozW2dmf1/v+D+YmTOzUG3UerzrNrPvmtnHQflDsaxnR2js2s1sopm9b2YrzazQzKbEuq7tycySzGy5ma0KrvtfgvLhZrbMzDaa2X+ZWUKs69rejnPtz5jZJ2a21swWmll8rOt6XM45/TTzBxgCTAoepwKfAmOC5znA/+InYw2MdV0747qBc4A3gMTg2KBY17UTr/2PwEVB+cXA27GuaztftwEpweN4YBkwFXgeuCYofxy4LdZ17cRrvzg4ZsCzXf3a1XJvAefcDufch8HjQ8AGICs4/Evg+0Do7lAf57pvA+Y558qDY7tjV8uOcZxrd0BacFo6sD02NewYzjscPI0PfhxwLvBiUP4boHP3jusEjV27c+4PwTEHLAdatgZvJ1O4t5KZ5QGnAsvMbBawzTm3KqaV6gTR1w2MBGYGX9P/bGanxbJuHa3etd8JPGxmW4H5wA9iV7OOYWa9zWwlsBv4E/A5UOycqwpOKSLSuAmV+tfunFsWdSweuB74n1
jVrzkU7q1gZinAYvw/8CrgXuBHMa1UJ4i+bufcQfx+AP3xX1n/EXjezCyGVewwDVz7bcBdzrkc4C5gQSzr1xGcc9XOuYn4FuoUYHSMq9Rp6l+7mY2LOvzvwDvOuXdjU7vmUbi3UPBbezHwjHPuJeAkYDiwysw24f9n+NDMBseulu2vgesG33J7KfimuhyowS+wFCqNXPtsoPbxC/jwCyXnXDHwFjANyDCz2k1+soFtMatYJ4i69gsBzOw+IBO4O5b1ag6FewsErdIFwAbn3CMAzrk1zrlBzrk851wePvAmOed2xrCq7aqh6w78Hn9TFTMbCSTQNVfOa7XjXPt24Kzg8bnAZ51dt45kZplmlhE87gOcj7/f8BZwRXDabODl2NSw4zRy7R+b2S3A14FrnXM1saxjc2iGaguY2RnAu8AafCsV4F7n3B+iztkEFDjnQhNyjV03fqTMQmAiUAF8zzn3Zkwq2UGOc+0HgUfxXVNlwO3OuRUxqWQHMLN8/A3T3vhG4PPOuR+b2YnAc/juuI+Av6m9oR4Wx7n2KvxouEPBqS85534co2o2SeEuIhJC6pYREQkhhbuISAgp3EVEQkjhLiISQgp3EZEQUriLiISQwl06nZnlmdnaVr72bDN7tb3r1JHMrMDM/rWFr7nfzL7XUXWS8Itr+hQRaQvnXCFQGOt6SM+ilrvESlyw+cEGM3vRzJLN7Dwz+8jM1gSbISQCmNmFwYYgHwLfDMp6mdlnZpYZ9Xxj7fP6zGyRmf0q2GDji+AbwMLg8xdFnferYPONuk0agvJ5wYYdq81sflB2ZbBxwyoze6exC43+thG0yBea2dtBPe6IOu+HZvapmb0HjIoqP8nM/sfMVpjZu2Y22szizOwDMzs7OOfnZvbTlv9nkNCK9YLy+ul5P0Aefm3wGcHzhcA/A1uBkUHZ0/hVN5OC8hH4TRKeB14NzrkPv0ojwAXA4uN85iL8tHkDZuGXDxiPb+CsACYG5/UP/uwNvA3kAwOAT4jM6M4I/lwDZEWXNfLZZ0fV+X5gKZCIX2RtH3698MnB+yXj14nfiF/OAeD/gBHB49OBN4PHY/HrvXwNvxRAQqz/2+qn6/yo5S6xstU5tyR4/J/AecCXzrlPg7LfAGfil5n90jn3mXPOBefWWgh8O3h8E/BUE5/538F7rAF2Ob/oWw2wDv8LB+Cq4BvCR/jwHAOU4NePWWBm3wRKg3OXAIvM7Fb8L4Pmes05V+78+kO7gROAmcDvnHOlzi8p/ArULTU8HXghWF/8/+F3h8I5tw74D+BV4CbnXEUL6iAhp3CXWKm/qFFxi9/Aua3ALjM7F7/k7utNvKR2gauaqMe1z+PMbDjwPeA851w+8BqQ5PzmFFPwOxBdQrBJg3PuO/hvHDnACjMb0MyqR392Nce/99ULv0HGxKifU6KOj8f/3Q1q5mdLD6Fwl1jJNbNpwePr8Dcc88zs5KDseuDPwMdB+UlB+bX13ufX+Nb8C8656jbWKQ04ApSY2QnARVDXek53fvXPu4AJQflJzrllzrkfAXvwId9a7wCXmVkfM0sFvgEQtOK/NLMrg880M6v9/G/iV2c8E/i32mVqRUDhLrHzCTDXzDYA/fB70N6I736oXV73cedcGTAHeC3oLqm/T+srQApNd8k0yfltEj/C/0L5Lb7bBfzG2K+a2WrgPSIbNTwc3Pxdi+9Hb/U2i87v0/pfwXu8DnwQdfhbwM1mtgrfhTTLzAYC84Bbgq6sx/BLEIsAWvJXujkzKwB+6ZybGeu6iHQlGucu3ZaZ3YPfy/Rbsa6LSFejlruEipn9ELiyXvELzrkOHwNuZl8HHqxX/KVz7vKO/myR+hTuIiIhpBuqIiIhpHAXEQkhhbuISAgp3EVEQuj/A6y/m6qL6lQYAAAAAElFTkSuQmCC\n"
          },
          "metadata": {
            "needs_background": "light"
          }
        }
      ],
      "source": [
        "people.plot(kind = \"line\", x = \"body_mass_index\", y = [\"height\", \"weight\"])\n",
        "plt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "l0DjpOZJbdkj"
      },
      "source": [
        "Again, there are far too many options to list here. The best approach is to browse the [Visualization](http://pandas.pydata.org/pandas-docs/stable/visualization.html) page in the pandas documentation, find the plot you are interested in, and look at the example code."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6m0s18-Hbdkj"
      },
      "source": [
        "#### Operations on `DataFrame`s\n",
        "Although `DataFrame`s do not try to mimic NumPy arrays, there are a few similarities. Let's create a `DataFrame` to demonstrate this:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 113,
      "metadata": {
        "id": "-D1z8SV2bdkj",
        "outputId": "ed63ff4a-091a-4273-f8ec-0a5590d5f5e1",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         sep  oct  nov\n",
              "alice      8    8    9\n",
              "bob       10    9    9\n",
              "charles    4    8    2\n",
              "darwin     9   10   10"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-b4304c68-72b1-4f26-9891-4c8c7c8533ce\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>sep</th>\n",
              "      <th>oct</th>\n",
              "      <th>nov</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>8</td>\n",
              "      <td>8</td>\n",
              "      <td>9</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>10</td>\n",
              "      <td>9</td>\n",
              "      <td>9</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>4</td>\n",
              "      <td>8</td>\n",
              "      <td>2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>9</td>\n",
              "      <td>10</td>\n",
              "      <td>10</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-b4304c68-72b1-4f26-9891-4c8c7c8533ce')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-b4304c68-72b1-4f26-9891-4c8c7c8533ce button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-b4304c68-72b1-4f26-9891-4c8c7c8533ce');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 113
        }
      ],
      "source": [
        "grades_array = np.array([[8, 8, 9], [10, 9, 9], [4, 8, 2], [9, 10, 10]])\n",
        "grades = pd.DataFrame(grades_array, columns=[\"sep\", \"oct\", \"nov\"], index=[\"alice\", \"bob\", \"charles\", \"darwin\"])\n",
        "grades"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YNYV5RC2bdkj"
      },
      "source": [
        "You can apply NumPy mathematical functions to a `DataFrame`; the function is applied element-wise to every value:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 114,
      "metadata": {
        "id": "tVS8LYkqbdkj",
        "outputId": "38d63ab6-a283-4e63-f5e0-fc75db2711fa",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "              sep       oct       nov\n",
              "alice    2.828427  2.828427  3.000000\n",
              "bob      3.162278  3.000000  3.000000\n",
              "charles  2.000000  2.828427  1.414214\n",
              "darwin   3.000000  3.162278  3.162278"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-89d73db3-19ff-4207-8a3d-1e3c5ac079ff\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>sep</th>\n",
              "      <th>oct</th>\n",
              "      <th>nov</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>2.828427</td>\n",
              "      <td>2.828427</td>\n",
              "      <td>3.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>3.162278</td>\n",
              "      <td>3.000000</td>\n",
              "      <td>3.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>2.000000</td>\n",
              "      <td>2.828427</td>\n",
              "      <td>1.414214</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>3.000000</td>\n",
              "      <td>3.162278</td>\n",
              "      <td>3.162278</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-89d73db3-19ff-4207-8a3d-1e3c5ac079ff')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-89d73db3-19ff-4207-8a3d-1e3c5ac079ff button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-89d73db3-19ff-4207-8a3d-1e3c5ac079ff');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 114
        }
      ],
      "source": [
        "np.sqrt(grades)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Similarly, adding a single value to a `DataFrame` adds that value to every element. This is called *broadcasting*:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 115,
      "metadata": {
        "id": "H9ZQAzESbdkj",
        "outputId": "fcfd6dba-2682-4801-c339-c0392be70cd2",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         sep  oct  nov\n",
              "alice      9    9   10\n",
              "bob       11   10   10\n",
              "charles    5    9    3\n",
              "darwin    10   11   11"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-ba556d47-0832-4ad2-95b1-2cb01ba9ecf8\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>sep</th>\n",
              "      <th>oct</th>\n",
              "      <th>nov</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>9</td>\n",
              "      <td>9</td>\n",
              "      <td>10</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>11</td>\n",
              "      <td>10</td>\n",
              "      <td>10</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>5</td>\n",
              "      <td>9</td>\n",
              "      <td>3</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>10</td>\n",
              "      <td>11</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-ba556d47-0832-4ad2-95b1-2cb01ba9ecf8')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-ba556d47-0832-4ad2-95b1-2cb01ba9ecf8 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-ba556d47-0832-4ad2-95b1-2cb01ba9ecf8');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 115
        }
      ],
      "source": [
        "grades + 1"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tUIPt4C-bdkk"
      },
      "source": [
        "Aggregation operations, such as computing the `max`, the `sum` or the `mean` of a `DataFrame`, apply to each column by default and return a `Series` object:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 116,
      "metadata": {
        "id": "1C8lhBIXbdkk",
        "outputId": "d05f160a-9996-4ffa-883e-0658a5fa34d7",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "sep    7.75\n",
              "oct    8.75\n",
              "nov    7.50\n",
              "dtype: float64"
            ]
          },
          "metadata": {},
          "execution_count": 116
        }
      ],
      "source": [
        "grades.mean()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qWRNPxVGbdkl"
      },
      "source": [
        "Most of these functions take an optional `axis` parameter which lets you specify along which axis of the `DataFrame` you want the operation executed. The default is `axis=0`, meaning that the operation is executed vertically (on each column). You can set `axis=1` to execute the operation horizontally (on each row). For example, let's find out which students had all grades greater than `5`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 117,
      "metadata": {
        "id": "HoyAlWC6bdkl",
        "outputId": "afee0091-1564-40c4-aff7-3e43d167ac8a",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "alice       True\n",
              "bob         True\n",
              "charles    False\n",
              "darwin      True\n",
              "dtype: bool"
            ]
          },
          "metadata": {},
          "execution_count": 117
        }
      ],
      "source": [
        "(grades > 5).all(axis = 1)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mdIwFwAgbdkl"
      },
      "source": [
        "If you add a `Series` object to a `DataFrame` (or execute any other binary operation), pandas attempts to broadcast the operation to all *rows* in the `DataFrame`. The index of the `Series` is aligned with the column labels of the `DataFrame`, so this gives meaningful results only when the `Series` has one entry per column. For example, let's subtract the `mean` of the `DataFrame` (a `Series` object) from the `DataFrame`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 118,
      "metadata": {
        "id": "sUzEkkg1bdkl",
        "outputId": "2c56256b-9c5e-4306-e47b-03b1457c099c",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "          sep   oct  nov\n",
              "alice    0.25 -0.75  1.5\n",
              "bob      2.25  0.25  1.5\n",
              "charles -3.75 -0.75 -5.5\n",
              "darwin   1.25  1.25  2.5"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-ed28d65b-a808-48b5-993e-c8f31b41dbdc\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>sep</th>\n",
              "      <th>oct</th>\n",
              "      <th>nov</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>0.25</td>\n",
              "      <td>-0.75</td>\n",
              "      <td>1.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>2.25</td>\n",
              "      <td>0.25</td>\n",
              "      <td>1.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>-3.75</td>\n",
              "      <td>-0.75</td>\n",
              "      <td>-5.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>1.25</td>\n",
              "      <td>1.25</td>\n",
              "      <td>2.5</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-ed28d65b-a808-48b5-993e-c8f31b41dbdc')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-ed28d65b-a808-48b5-993e-c8f31b41dbdc button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-ed28d65b-a808-48b5-993e-c8f31b41dbdc');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 118
        }
      ],
      "source": [
        "grades - grades.mean()  # equivalent to: grades - [7.75, 8.75, 7.50]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 119,
      "metadata": {
        "id": "xOU7mlmVbdkm",
        "outputId": "e6207be5-2db1-48fe-9b22-f0fb34aaca76",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "          sep   oct  nov\n",
              "alice    7.75  8.75  7.5\n",
              "bob      7.75  8.75  7.5\n",
              "charles  7.75  8.75  7.5\n",
              "darwin   7.75  8.75  7.5"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-bb10d6ec-9e35-4e06-ae44-b3b06ff330b9\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>sep</th>\n",
              "      <th>oct</th>\n",
              "      <th>nov</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>7.75</td>\n",
              "      <td>8.75</td>\n",
              "      <td>7.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>7.75</td>\n",
              "      <td>8.75</td>\n",
              "      <td>7.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>7.75</td>\n",
              "      <td>8.75</td>\n",
              "      <td>7.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>7.75</td>\n",
              "      <td>8.75</td>\n",
              "      <td>7.5</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-bb10d6ec-9e35-4e06-ae44-b3b06ff330b9')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-bb10d6ec-9e35-4e06-ae44-b3b06ff330b9 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-bb10d6ec-9e35-4e06-ae44-b3b06ff330b9');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 119
        }
      ],
      "source": [
        "# We subtracted `7.75` from all September grades, `8.75` from October grades and `7.50` \n",
        "# from November grades. It is equivalent to subtracting this `DataFrame`:\n",
        "pd.DataFrame([[7.75, 8.75, 7.50]]*4, index=grades.index, columns=grades.columns)"
      ]
    },
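    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The subtraction above is matched by label: the index of the `Series` is aligned with the column names of the `DataFrame`. The sketch below illustrates this with a stand-in `demo_grades` frame (an assumption made for illustration, as is the `partial` series). It also shows one way to broadcast along the other direction, using `DataFrame.sub()` with `axis=0` to subtract a per-student value from each row:\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Stand-in data, assumed for this sketch (not the notebook's own variable).\n",
        "demo_grades = pd.DataFrame(\n",
        "    [[8, 8, 9], [10, 9, 9], [4, 8, 2], [9, 10, 10]],\n",
        "    index=['alice', 'bob', 'charles', 'darwin'],\n",
        "    columns=['sep', 'oct', 'nov'],\n",
        ")\n",
        "\n",
        "# Column means: a Series whose index holds the column labels.\n",
        "monthly_mean = demo_grades.mean()\n",
        "centered_by_month = demo_grades - monthly_mean\n",
        "\n",
        "# Alignment is by label: 'nov' has no entry in this Series, so it becomes NaN.\n",
        "partial = pd.Series({'sep': 7.75, 'oct': 8.75})\n",
        "partly_centered = demo_grades - partial\n",
        "\n",
        "# To subtract one value per student, align on the row index with axis=0.\n",
        "student_mean = demo_grades.mean(axis=1)\n",
        "centered_by_student = demo_grades.sub(student_mean, axis=0)\n",
        "\n",
        "print(partly_centered)\n",
        "print(centered_by_student)\n",
        "```\n",
        "\n",
        "The operator form (`-`) always aligns a `Series` with the columns; the method form (`sub`, `add`, `mul`, `div`) accepts an `axis` argument for the cases where the `Series` is indexed like the rows."
      ]
    },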
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cc1Lz1SSbdkm"
      },
      "source": [
        "If you want to subtract the global mean from every grade, here is one way to do it:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 120,
      "metadata": {
        "scrolled": true,
        "id": "nCJUsodMbdkm",
        "outputId": "18fdedb1-d8e4-409b-8edb-ae749e8be8e3",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         sep  oct  nov\n",
              "alice    0.0  0.0  1.0\n",
              "bob      2.0  1.0  1.0\n",
              "charles -4.0  0.0 -6.0\n",
              "darwin   1.0  2.0  2.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-95a52617-808b-4ad0-9fc1-2c61466e57c5\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>sep</th>\n",
              "      <th>oct</th>\n",
              "      <th>nov</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>1.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>2.0</td>\n",
              "      <td>1.0</td>\n",
              "      <td>1.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>-4.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>-6.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>1.0</td>\n",
              "      <td>2.0</td>\n",
              "      <td>2.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-95a52617-808b-4ad0-9fc1-2c61466e57c5')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-95a52617-808b-4ad0-9fc1-2c61466e57c5 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-95a52617-808b-4ad0-9fc1-2c61466e57c5');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 120
        }
      ],
      "source": [
        "grades - grades.values.mean()  # subtracts the global mean (8.00) from all grades"
      ]
    },
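    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "One caveat with this approach: the NumPy `mean()` method does not skip missing values the way pandas reductions do. Here is a minimal sketch, using a stand-in `demo_grades` frame (an assumption for illustration) and `np.nanmean` as one possible NaN-aware alternative:\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "\n",
        "# Stand-in data, assumed for this sketch (not the notebook's own variable).\n",
        "demo_grades = pd.DataFrame(\n",
        "    [[8, 8, 9], [10, 9, 9], [4, 8, 2], [9, 10, 10]],\n",
        "    index=['alice', 'bob', 'charles', 'darwin'],\n",
        "    columns=['sep', 'oct', 'nov'],\n",
        ")\n",
        "\n",
        "# Mean over every cell of the underlying NumPy array.\n",
        "global_mean = demo_grades.to_numpy().mean()\n",
        "\n",
        "# Introduce one missing grade in a float copy of the data.\n",
        "with_gap = demo_grades.astype(float)\n",
        "with_gap.loc['alice', 'sep'] = np.nan\n",
        "\n",
        "# ndarray.mean() propagates NaN, so this result is NaN...\n",
        "plain_mean = with_gap.to_numpy().mean()\n",
        "\n",
        "# ...while np.nanmean() ignores the missing cell.\n",
        "nan_aware_mean = np.nanmean(with_gap.to_numpy())\n",
        "\n",
        "print(global_mean, plain_mean, nan_aware_mean)\n",
        "```\n",
        "\n",
        "When the data may contain `NaN`, a NaN-aware reduction avoids turning every centered value into `NaN`."
      ]
    },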
    {
      "cell_type": "code",
      "execution_count": 121,
      "metadata": {
        "id": "Qeykr6KIbdkm",
        "outputId": "0861bc4b-7f51-41c6-87fd-5a60b4acdc8a",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         oct  nov  dec\n",
              "bob      0.0  NaN  2.0\n",
              "colin    NaN  1.0  0.0\n",
              "darwin   0.0  1.0  0.0\n",
              "charles  3.0  3.0  0.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-f3e77c39-ab74-47e6-ab54-c0a505b1c30d\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>oct</th>\n",
              "      <th>nov</th>\n",
              "      <th>dec</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>0.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>2.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>colin</th>\n",
              "      <td>NaN</td>\n",
              "      <td>1.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>0.0</td>\n",
              "      <td>1.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>3.0</td>\n",
              "      <td>3.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-f3e77c39-ab74-47e6-ab54-c0a505b1c30d')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-f3e77c39-ab74-47e6-ab54-c0a505b1c30d button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-f3e77c39-ab74-47e6-ab54-c0a505b1c30d');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 121
        }
      ],
      "source": [
        "# Bonus points for a partly different set of students and months; NaN marks a missing entry\n",
        "bonus_array = np.array([[0, np.nan, 2], [np.nan, 1, 0], [0, 1, 0], [3, 3, 0]])\n",
        "bonus_points = pd.DataFrame(bonus_array, columns=[\"oct\", \"nov\", \"dec\"], index=[\"bob\", \"colin\", \"darwin\", \"charles\"])\n",
        "bonus_points"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 122,
      "metadata": {
        "scrolled": true,
        "id": "PmBlxZ2bbdkm",
        "outputId": "5c6c880c-eb65-41cd-e1ab-66a41604fc3d",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         dec   nov   oct  sep\n",
              "alice    NaN   NaN   NaN  NaN\n",
              "bob      NaN   NaN   9.0  NaN\n",
              "charles  NaN   5.0  11.0  NaN\n",
              "colin    NaN   NaN   NaN  NaN\n",
              "darwin   NaN  11.0  10.0  NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-e2b10ceb-8b20-4972-aafc-4d8d1e35d718\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>dec</th>\n",
              "      <th>nov</th>\n",
              "      <th>oct</th>\n",
              "      <th>sep</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>9.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>NaN</td>\n",
              "      <td>5.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>colin</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>10.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 122
        }
      ],
      "source": [
        "grades + bonus_points"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZEtUuQNHbdkm"
      },
      "source": [
        "#### Handling missing data\n",
        "Dealing with missing data is a frequent task when working with real-life data. Pandas offers a few tools to handle it.\n",
        " \n",
        "Let's try to fix the problem above, where adding `grades` and `bonus_points` left `NaN` in most cells. For example, we can decide that missing data should result in a zero instead of `NaN`. We can replace all `NaN` values with any value using the `fillna()` method:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 123,
      "metadata": {
        "scrolled": true,
        "id": "89vMid5vbdkm",
        "outputId": "4f8a9ba1-42e1-43f9-a1da-e118325c7ea4",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         dec   nov   oct  sep\n",
              "alice    0.0   0.0   0.0  0.0\n",
              "bob      0.0   0.0   9.0  0.0\n",
              "charles  0.0   5.0  11.0  0.0\n",
              "colin    0.0   0.0   0.0  0.0\n",
              "darwin   0.0  11.0  10.0  0.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-79fe7551-e4ad-4218-be9a-df0d0454a98b\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>dec</th>\n",
              "      <th>nov</th>\n",
              "      <th>oct</th>\n",
              "      <th>sep</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>9.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>0.0</td>\n",
              "      <td>5.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>colin</th>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>0.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>10.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 123
        }
      ],
      "source": [
        "(grades + bonus_points).fillna(0)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 124,
      "metadata": {
        "scrolled": true,
        "id": "Nb9szHYrbdko",
        "outputId": "a8d1cfde-0587-49a0-def0-222278a6caac",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         dec   nov   oct  sep\n",
              "alice    NaN   NaN   NaN  NaN\n",
              "bob      NaN   NaN   9.0  NaN\n",
              "charles  NaN   5.0  11.0  NaN\n",
              "colin    NaN   NaN   NaN  NaN\n",
              "darwin   NaN  11.0  10.0  NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-4a2657e7-c87b-4c01-b6b8-4582a4ac1eb2\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>dec</th>\n",
              "      <th>nov</th>\n",
              "      <th>oct</th>\n",
              "      <th>sep</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>9.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>NaN</td>\n",
              "      <td>5.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>colin</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>10.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 124
        }
      ],
      "source": [
        "final_grades = grades + bonus_points\n",
        "final_grades"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "84Ov3x5Ubdko"
      },
      "source": [
        "We can call the `dropna()` method with `how=\"all\"` to get rid of rows that are full of `NaN`s:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 125,
      "metadata": {
        "id": "ACmmOPyrbdko",
        "outputId": "6ee6fbfc-d4a4-4ef4-de42-2e131edeca8d",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         dec   nov   oct  sep\n",
              "bob      NaN   NaN   9.0  NaN\n",
              "charles  NaN   5.0  11.0  NaN\n",
              "darwin   NaN  11.0  10.0  NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-2b59cc0e-b0ec-4d4d-8152-ab36b8076d06\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>dec</th>\n",
              "      <th>nov</th>\n",
              "      <th>oct</th>\n",
              "      <th>sep</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>9.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>NaN</td>\n",
              "      <td>5.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>10.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 125
        }
      ],
      "source": [
        "final_grades_clean = final_grades.dropna(how=\"all\")\n",
        "final_grades_clean"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "F_ft7zzcbdko"
      },
      "source": [
        "Now let's remove columns that are full of `NaN`s by setting the `axis` argument to `1`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 126,
      "metadata": {
        "id": "DeqnrLEnbdko",
        "outputId": "d957894d-f4d0-4dcb-c788-48ccf66a5394",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "          nov   oct\n",
              "bob       NaN   9.0\n",
              "charles   5.0  11.0\n",
              "darwin   11.0  10.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-4142b8eb-51bb-42e3-a328-bbe4069745dd\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>nov</th>\n",
              "      <th>oct</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>NaN</td>\n",
              "      <td>9.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>5.0</td>\n",
              "      <td>11.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>11.0</td>\n",
              "      <td>10.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-4142b8eb-51bb-42e3-a328-bbe4069745dd')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-4142b8eb-51bb-42e3-a328-bbe4069745dd button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-4142b8eb-51bb-42e3-a328-bbe4069745dd');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 126
        }
      ],
      "source": [
        "final_grades_clean = final_grades_clean.dropna(axis=1, how=\"all\")\n",
        "final_grades_clean"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NbyYrlrzbdko"
      },
      "source": [
        "#### Aggregating with `groupby`\n",
        "Similar to the SQL language, pandas allows grouping your data into groups to run calculations over each group.\n",
        "\n",
        "First, let's add some extra data about each person so we can group them, and let's go back to the `final_grades` `DataFrame` so we can see how `NaN` values are handled:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 127,
      "metadata": {
        "scrolled": true,
        "id": "II2IdemTbdkp",
        "outputId": "c78cf8f1-a24c-416a-a4b7-5a7188b2c34d",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         dec   nov   oct  sep    hobby\n",
              "alice    NaN   NaN   NaN  NaN   Biking\n",
              "bob      NaN   NaN   9.0  NaN  Dancing\n",
              "charles  NaN   5.0  11.0  NaN      NaN\n",
              "colin    NaN   NaN   NaN  NaN  Dancing\n",
              "darwin   NaN  11.0  10.0  NaN   Biking"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-69041dbd-6bc1-47dc-9a7b-7d3dba77a93e\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>dec</th>\n",
              "      <th>nov</th>\n",
              "      <th>oct</th>\n",
              "      <th>sep</th>\n",
              "      <th>hobby</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Biking</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>9.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Dancing</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>charles</th>\n",
              "      <td>NaN</td>\n",
              "      <td>5.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>colin</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Dancing</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>darwin</th>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>10.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>Biking</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-69041dbd-6bc1-47dc-9a7b-7d3dba77a93e')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-69041dbd-6bc1-47dc-9a7b-7d3dba77a93e button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-69041dbd-6bc1-47dc-9a7b-7d3dba77a93e');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 127
        }
      ],
      "source": [
        "final_grades[\"hobby\"] = [\"Biking\", \"Dancing\", np.nan, \"Dancing\", \"Biking\"]\n",
        "final_grades"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2suQBYhGbdkp"
      },
      "source": [
        "Now let's group data in this `DataFrame` by hobby:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 128,
      "metadata": {
        "id": "dOhALCgkbdkp",
        "outputId": "a9170462-2a5a-4d52-a2e5-1c0b873e5195",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7eff70a61750>"
            ]
          },
          "metadata": {},
          "execution_count": 128
        }
      ],
      "source": [
        "grouped_grades = final_grades.groupby(\"hobby\")\n",
        "grouped_grades"
      ]
    },
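    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `DataFrameGroupBy` object shown above does not contain any results yet: it only records how the rows are to be split. Here is a minimal sketch of how one might inspect such an object, assuming a small made-up `DataFrame` (`demo_people` and its values are illustrative and are not the notebook's `final_grades`):"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical toy data (illustrative names and values)\n",
        "demo_people = pd.DataFrame(\n",
        "    {\n",
        "        \"oct\": [np.nan, 9.0, 11.0, np.nan, 10.0],\n",
        "        \"hobby\": [\"Biking\", \"Dancing\", np.nan, \"Dancing\", \"Biking\"],\n",
        "    },\n",
        "    index=[\"alice\", \"bob\", \"charles\", \"colin\", \"darwin\"],\n",
        ")\n",
        "demo_grouped = demo_people.groupby(\"hobby\")\n",
        "\n",
        "# Number of rows in each group (rows with NaN grades still count)\n",
        "demo_sizes = demo_grouped.size()\n",
        "\n",
        "# Row labels that belong to each group\n",
        "demo_members = {key: list(labels) for key, labels in demo_grouped.groups.items()}\n",
        "\n",
        "# Extract a single group as a regular DataFrame\n",
        "demo_biking = demo_grouped.get_group(\"Biking\")\n",
        "demo_biking"
      ]
    },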
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DZAFvTlTbdkp"
      },
      "source": [
        "We are ready to compute the average grade per hobby:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 129,
      "metadata": {
        "id": "o5Gs6lcLbdkp",
        "outputId": "992a9833-c604-45be-8893-c6cfbf97e791",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         dec   nov   oct  sep\n",
              "hobby                        \n",
              "Biking   NaN  11.0  10.0  NaN\n",
              "Dancing  NaN   NaN   9.0  NaN"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-e19569fd-d1fb-44b7-b6be-2e83975b714a\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>dec</th>\n",
              "      <th>nov</th>\n",
              "      <th>oct</th>\n",
              "      <th>sep</th>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>hobby</th>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>Biking</th>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>10.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>Dancing</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>9.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e19569fd-d1fb-44b7-b6be-2e83975b714a')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-e19569fd-d1fb-44b7-b6be-2e83975b714a button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-e19569fd-d1fb-44b7-b6be-2e83975b714a');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 129
        }
      ],
      "source": [
        "grouped_grades.mean()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pgOZUDdRbdkp"
      },
      "source": [
        "That was easy! Note that the `NaN` values have simply been skipped when computing the means."
      ]
    },
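    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Here is a minimal sketch of how missing values interact with `groupby`, using similar data rebuilt under hypothetical names (`demo_grades` is illustrative, not the notebook's `final_grades`). By default, rows whose key is `NaN` are left out and `mean()` skips `NaN` values. The sketch assumes pandas 1.1 or later, where `groupby` accepts `dropna=False` to keep the `NaN` key as a group of its own:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical toy data (illustrative names and values)\n",
        "demo_grades = pd.DataFrame(\n",
        "    {\n",
        "        \"oct\": [np.nan, 9.0, 11.0, np.nan, 10.0],\n",
        "        \"nov\": [np.nan, np.nan, 5.0, np.nan, 11.0],\n",
        "        \"hobby\": [\"Biking\", \"Dancing\", np.nan, \"Dancing\", \"Biking\"],\n",
        "    },\n",
        "    index=[\"alice\", \"bob\", \"charles\", \"colin\", \"darwin\"],\n",
        ")\n",
        "\n",
        "# Default behaviour: the row with a NaN hobby is left out of every group\n",
        "demo_means = demo_grades.groupby(\"hobby\").mean()\n",
        "\n",
        "# dropna=False (assumes pandas >= 1.1) keeps the NaN key as its own group\n",
        "demo_means_with_nan = demo_grades.groupby(\"hobby\", dropna=False).mean()\n",
        "\n",
        "# count() shows how many non-NaN grades each mean is based on\n",
        "demo_counts = demo_grades.groupby(\"hobby\").count()\n",
        "demo_means_with_nan"
      ]
    },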
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PB0GOgYLbdkp"
      },
      "source": [
        "#### Pivot tables\n",
        "Pandas supports spreadsheet-like [pivot tables](https://en.wikipedia.org/wiki/Pivot_table) that allow quick data summarization."
      ]
    },
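    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a minimal sketch, here is how `pd.pivot_table` might be used on a small long-format table. The `demo_long` data and its column names are made up for illustration. The `index` and `columns` arguments choose the row and column keys, `values` chooses what to aggregate, and `aggfunc` defaults to the mean:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "\n",
        "# Hypothetical long-format grades (illustrative names and values)\n",
        "demo_long = pd.DataFrame(\n",
        "    {\n",
        "        \"name\": [\"bob\", \"charles\", \"charles\", \"darwin\", \"darwin\"],\n",
        "        \"month\": [\"oct\", \"nov\", \"oct\", \"nov\", \"oct\"],\n",
        "        \"hobby\": [\"Dancing\", \"Reading\", \"Reading\", \"Biking\", \"Biking\"],\n",
        "        \"grade\": [9, 5, 11, 11, 10],\n",
        "    }\n",
        ")\n",
        "\n",
        "# One row per name, one column per month; cells hold the mean grade\n",
        "demo_pivot = pd.pivot_table(demo_long, index=\"name\", columns=\"month\", values=\"grade\")\n",
        "\n",
        "# A different aggregation, with row and column totals added by margins=True\n",
        "demo_pivot_max = pd.pivot_table(\n",
        "    demo_long,\n",
        "    index=\"hobby\",\n",
        "    columns=\"month\",\n",
        "    values=\"grade\",\n",
        "    aggfunc=\"max\",\n",
        "    margins=True,\n",
        ")\n",
        "demo_pivot_max"
      ]
    },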
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rK5gDiX6bdkr"
      },
      "source": [
        "#### Overview functions\n",
        "When dealing with large `DataFrames`, it is useful to get a quick overview of its content. Pandas offers a few functions for this. First, let's create a large `DataFrame` with a mix of numeric values, missing values and text values. Notice how Jupyter displays only the corners of the `DataFrame`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 130,
      "metadata": {
        "id": "6u57h9MQbdkr",
        "outputId": "f86cea34-090e-4e85-c6c7-e02d3f1f07ad",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 424
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         A     B     C some_text      D      E      F     G      H      I  \\\n",
              "0      NaN  11.0  44.0    Blabla   99.0    NaN   88.0  22.0  165.0  143.0   \n",
              "1     11.0  22.0  55.0    Blabla  110.0    NaN   99.0  33.0    NaN  154.0   \n",
              "2     22.0  33.0  66.0    Blabla  121.0   11.0  110.0  44.0    NaN  165.0   \n",
              "3     33.0  44.0  77.0    Blabla  132.0   22.0  121.0  55.0   11.0    NaN   \n",
              "4     44.0  55.0  88.0    Blabla  143.0   33.0  132.0  66.0   22.0    NaN   \n",
              "...    ...   ...   ...       ...    ...    ...    ...   ...    ...    ...   \n",
              "9995   NaN   NaN  33.0    Blabla   88.0  165.0   77.0  11.0  154.0  132.0   \n",
              "9996   NaN  11.0  44.0    Blabla   99.0    NaN   88.0  22.0  165.0  143.0   \n",
              "9997  11.0  22.0  55.0    Blabla  110.0    NaN   99.0  33.0    NaN  154.0   \n",
              "9998  22.0  33.0  66.0    Blabla  121.0   11.0  110.0  44.0    NaN  165.0   \n",
              "9999  33.0  44.0  77.0    Blabla  132.0   22.0  121.0  55.0   11.0    NaN   \n",
              "\n",
              "      ...     Q     R     S     T      U      V      W     X      Y      Z  \n",
              "0     ...  11.0   NaN  11.0  44.0   99.0    NaN   88.0  22.0  165.0  143.0  \n",
              "1     ...  22.0  11.0  22.0  55.0  110.0    NaN   99.0  33.0    NaN  154.0  \n",
              "2     ...  33.0  22.0  33.0  66.0  121.0   11.0  110.0  44.0    NaN  165.0  \n",
              "3     ...  44.0  33.0  44.0  77.0  132.0   22.0  121.0  55.0   11.0    NaN  \n",
              "4     ...  55.0  44.0  55.0  88.0  143.0   33.0  132.0  66.0   22.0    NaN  \n",
              "...   ...   ...   ...   ...   ...    ...    ...    ...   ...    ...    ...  \n",
              "9995  ...   NaN   NaN   NaN  33.0   88.0  165.0   77.0  11.0  154.0  132.0  \n",
              "9996  ...  11.0   NaN  11.0  44.0   99.0    NaN   88.0  22.0  165.0  143.0  \n",
              "9997  ...  22.0  11.0  22.0  55.0  110.0    NaN   99.0  33.0    NaN  154.0  \n",
              "9998  ...  33.0  22.0  33.0  66.0  121.0   11.0  110.0  44.0    NaN  165.0  \n",
              "9999  ...  44.0  33.0  44.0  77.0  132.0   22.0  121.0  55.0   11.0    NaN  \n",
              "\n",
              "[10000 rows x 27 columns]"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-4a03dc73-5230-46b7-9bd5-02d3c95e5f0c\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>A</th>\n",
              "      <th>B</th>\n",
              "      <th>C</th>\n",
              "      <th>some_text</th>\n",
              "      <th>D</th>\n",
              "      <th>E</th>\n",
              "      <th>F</th>\n",
              "      <th>G</th>\n",
              "      <th>H</th>\n",
              "      <th>I</th>\n",
              "      <th>...</th>\n",
              "      <th>Q</th>\n",
              "      <th>R</th>\n",
              "      <th>S</th>\n",
              "      <th>T</th>\n",
              "      <th>U</th>\n",
              "      <th>V</th>\n",
              "      <th>W</th>\n",
              "      <th>X</th>\n",
              "      <th>Y</th>\n",
              "      <th>Z</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>99.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>88.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>165.0</td>\n",
              "      <td>143.0</td>\n",
              "      <td>...</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>99.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>88.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>165.0</td>\n",
              "      <td>143.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>11.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>110.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>99.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>154.0</td>\n",
              "      <td>...</td>\n",
              "      <td>22.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>99.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>154.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>22.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>121.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>165.0</td>\n",
              "      <td>...</td>\n",
              "      <td>33.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>165.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>33.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>132.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>...</td>\n",
              "      <td>44.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>132.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>44.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>88.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>143.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>132.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>...</td>\n",
              "      <td>55.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>88.0</td>\n",
              "      <td>143.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>132.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9995</th>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>33.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>88.0</td>\n",
              "      <td>165.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>154.0</td>\n",
              "      <td>132.0</td>\n",
              "      <td>...</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>33.0</td>\n",
              "      <td>88.0</td>\n",
              "      <td>165.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>154.0</td>\n",
              "      <td>132.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9996</th>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>99.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>88.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>165.0</td>\n",
              "      <td>143.0</td>\n",
              "      <td>...</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>99.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>88.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>165.0</td>\n",
              "      <td>143.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9997</th>\n",
              "      <td>11.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>110.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>99.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>154.0</td>\n",
              "      <td>...</td>\n",
              "      <td>22.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>99.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>154.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9998</th>\n",
              "      <td>22.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>121.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>165.0</td>\n",
              "      <td>...</td>\n",
              "      <td>33.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>165.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9999</th>\n",
              "      <td>33.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>132.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>...</td>\n",
              "      <td>44.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>132.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>10000 rows × 27 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-4a03dc73-5230-46b7-9bd5-02d3c95e5f0c')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-4a03dc73-5230-46b7-9bd5-02d3c95e5f0c button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-4a03dc73-5230-46b7-9bd5-02d3c95e5f0c');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 130
        }
      ],
      "source": [
        "much_data = np.fromfunction(lambda x,y: (x+y*y)%17*11, (10000, 26))\n",
        "large_df = pd.DataFrame(much_data, columns=list(\"ABCDEFGHIJKLMNOPQRSTUVWXYZ\"))\n",
        "large_df[large_df % 16 == 0] = np.nan\n",
        "large_df.insert(3,\"some_text\", \"Blabla\")\n",
        "large_df"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qwsVdaFNbdkr"
      },
      "source": [
        "The `head()` method returns the top 5 rows:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 131,
      "metadata": {
        "id": "ZD-qCazAbdkr",
        "outputId": "8b563e33-1328-48a0-80eb-0baf3373c098",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 236
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "      A     B     C some_text      D     E      F     G      H      I  ...  \\\n",
              "0   NaN  11.0  44.0    Blabla   99.0   NaN   88.0  22.0  165.0  143.0  ...   \n",
              "1  11.0  22.0  55.0    Blabla  110.0   NaN   99.0  33.0    NaN  154.0  ...   \n",
              "2  22.0  33.0  66.0    Blabla  121.0  11.0  110.0  44.0    NaN  165.0  ...   \n",
              "3  33.0  44.0  77.0    Blabla  132.0  22.0  121.0  55.0   11.0    NaN  ...   \n",
              "4  44.0  55.0  88.0    Blabla  143.0  33.0  132.0  66.0   22.0    NaN  ...   \n",
              "\n",
              "      Q     R     S     T      U     V      W     X      Y      Z  \n",
              "0  11.0   NaN  11.0  44.0   99.0   NaN   88.0  22.0  165.0  143.0  \n",
              "1  22.0  11.0  22.0  55.0  110.0   NaN   99.0  33.0    NaN  154.0  \n",
              "2  33.0  22.0  33.0  66.0  121.0  11.0  110.0  44.0    NaN  165.0  \n",
              "3  44.0  33.0  44.0  77.0  132.0  22.0  121.0  55.0   11.0    NaN  \n",
              "4  55.0  44.0  55.0  88.0  143.0  33.0  132.0  66.0   22.0    NaN  \n",
              "\n",
              "[5 rows x 27 columns]"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-49d96a22-c389-456b-893f-201bcd51e4c6\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>A</th>\n",
              "      <th>B</th>\n",
              "      <th>C</th>\n",
              "      <th>some_text</th>\n",
              "      <th>D</th>\n",
              "      <th>E</th>\n",
              "      <th>F</th>\n",
              "      <th>G</th>\n",
              "      <th>H</th>\n",
              "      <th>I</th>\n",
              "      <th>...</th>\n",
              "      <th>Q</th>\n",
              "      <th>R</th>\n",
              "      <th>S</th>\n",
              "      <th>T</th>\n",
              "      <th>U</th>\n",
              "      <th>V</th>\n",
              "      <th>W</th>\n",
              "      <th>X</th>\n",
              "      <th>Y</th>\n",
              "      <th>Z</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>99.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>88.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>165.0</td>\n",
              "      <td>143.0</td>\n",
              "      <td>...</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>11.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>99.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>88.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>165.0</td>\n",
              "      <td>143.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>11.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>110.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>99.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>154.0</td>\n",
              "      <td>...</td>\n",
              "      <td>22.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>99.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>154.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>22.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>121.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>165.0</td>\n",
              "      <td>...</td>\n",
              "      <td>33.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>165.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>33.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>132.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>...</td>\n",
              "      <td>44.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>132.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>44.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>88.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>143.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>132.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>...</td>\n",
              "      <td>55.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>88.0</td>\n",
              "      <td>143.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>132.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 27 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-49d96a22-c389-456b-893f-201bcd51e4c6')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-49d96a22-c389-456b-893f-201bcd51e4c6 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-49d96a22-c389-456b-893f-201bcd51e4c6');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 131
        }
      ],
      "source": [
        "large_df.head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "An6ZPfSvbdkr"
      },
      "source": [
        "Of course there's also a `tail()` function to view the bottom 5 rows. You can pass the number of rows you want:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 132,
      "metadata": {
        "id": "a4I2ghvbbdkr",
        "outputId": "e02cb78f-9f24-42c5-ff15-8eb7056f690f",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 141
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         A     B     C some_text      D     E      F     G     H      I  ...  \\\n",
              "9998  22.0  33.0  66.0    Blabla  121.0  11.0  110.0  44.0   NaN  165.0  ...   \n",
              "9999  33.0  44.0  77.0    Blabla  132.0  22.0  121.0  55.0  11.0    NaN  ...   \n",
              "\n",
              "         Q     R     S     T      U     V      W     X     Y      Z  \n",
              "9998  33.0  22.0  33.0  66.0  121.0  11.0  110.0  44.0   NaN  165.0  \n",
              "9999  44.0  33.0  44.0  77.0  132.0  22.0  121.0  55.0  11.0    NaN  \n",
              "\n",
              "[2 rows x 27 columns]"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-38a713c1-f860-4fbe-885a-c3c87516d5f1\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>A</th>\n",
              "      <th>B</th>\n",
              "      <th>C</th>\n",
              "      <th>some_text</th>\n",
              "      <th>D</th>\n",
              "      <th>E</th>\n",
              "      <th>F</th>\n",
              "      <th>G</th>\n",
              "      <th>H</th>\n",
              "      <th>I</th>\n",
              "      <th>...</th>\n",
              "      <th>Q</th>\n",
              "      <th>R</th>\n",
              "      <th>S</th>\n",
              "      <th>T</th>\n",
              "      <th>U</th>\n",
              "      <th>V</th>\n",
              "      <th>W</th>\n",
              "      <th>X</th>\n",
              "      <th>Y</th>\n",
              "      <th>Z</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>9998</th>\n",
              "      <td>22.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>121.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>165.0</td>\n",
              "      <td>...</td>\n",
              "      <td>33.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>66.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>110.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>165.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9999</th>\n",
              "      <td>33.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>Blabla</td>\n",
              "      <td>132.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "      <td>...</td>\n",
              "      <td>44.0</td>\n",
              "      <td>33.0</td>\n",
              "      <td>44.0</td>\n",
              "      <td>77.0</td>\n",
              "      <td>132.0</td>\n",
              "      <td>22.0</td>\n",
              "      <td>121.0</td>\n",
              "      <td>55.0</td>\n",
              "      <td>11.0</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>2 rows × 27 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-38a713c1-f860-4fbe-885a-c3c87516d5f1')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-38a713c1-f860-4fbe-885a-c3c87516d5f1 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-38a713c1-f860-4fbe-885a-c3c87516d5f1');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 132
        }
      ],
      "source": [
        "large_df.tail(n=2)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WtzFculCbdkr"
      },
      "source": [
        "The `info()` method prints out a summary of each columns contents:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 133,
      "metadata": {
        "id": "m0kK-Undbdkr",
        "outputId": "d2b9328d-0542-4815-c628-019e01dfc7e3",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "<class 'pandas.core.frame.DataFrame'>\n",
            "RangeIndex: 10000 entries, 0 to 9999\n",
            "Data columns (total 27 columns):\n",
            " #   Column     Non-Null Count  Dtype  \n",
            "---  ------     --------------  -----  \n",
            " 0   A          8823 non-null   float64\n",
            " 1   B          8824 non-null   float64\n",
            " 2   C          8824 non-null   float64\n",
            " 3   some_text  10000 non-null  object \n",
            " 4   D          8824 non-null   float64\n",
            " 5   E          8822 non-null   float64\n",
            " 6   F          8824 non-null   float64\n",
            " 7   G          8824 non-null   float64\n",
            " 8   H          8822 non-null   float64\n",
            " 9   I          8823 non-null   float64\n",
            " 10  J          8823 non-null   float64\n",
            " 11  K          8822 non-null   float64\n",
            " 12  L          8824 non-null   float64\n",
            " 13  M          8824 non-null   float64\n",
            " 14  N          8822 non-null   float64\n",
            " 15  O          8824 non-null   float64\n",
            " 16  P          8824 non-null   float64\n",
            " 17  Q          8824 non-null   float64\n",
            " 18  R          8823 non-null   float64\n",
            " 19  S          8824 non-null   float64\n",
            " 20  T          8824 non-null   float64\n",
            " 21  U          8824 non-null   float64\n",
            " 22  V          8822 non-null   float64\n",
            " 23  W          8824 non-null   float64\n",
            " 24  X          8824 non-null   float64\n",
            " 25  Y          8822 non-null   float64\n",
            " 26  Z          8823 non-null   float64\n",
            "dtypes: float64(26), object(1)\n",
            "memory usage: 2.1+ MB\n"
          ]
        }
      ],
      "source": [
        "large_df.info()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-93LVb6xbdks"
      },
      "source": [
        "Finally, the `describe()` method gives a nice overview of the main aggregated values over each column:\n",
        "* `count`: number of non-null (not NaN) values\n",
        "* `mean`: mean of non-null values\n",
        "* `std`: [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) of non-null values\n",
        "* `min`: minimum of non-null values\n",
        "* `25%`, `50%`, `75%`: 25th, 50th and 75th [percentile](https://en.wikipedia.org/wiki/Percentile) of non-null values\n",
        "* `max`: maximum of non-null values"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 134,
      "metadata": {
        "id": "DqG1O-2Cbdks",
        "outputId": "5f62fedb-b288-46bc-ab3c-400708640e8f",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 394
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "                 A            B            C            D            E  \\\n",
              "count  8823.000000  8824.000000  8824.000000  8824.000000  8822.000000   \n",
              "mean     87.977559    87.972575    87.987534    88.012466    87.983791   \n",
              "std      47.535911    47.535523    47.521679    47.521679    47.535001   \n",
              "min      11.000000    11.000000    11.000000    11.000000    11.000000   \n",
              "25%      44.000000    44.000000    44.000000    44.000000    44.000000   \n",
              "50%      88.000000    88.000000    88.000000    88.000000    88.000000   \n",
              "75%     132.000000   132.000000   132.000000   132.000000   132.000000   \n",
              "max     165.000000   165.000000   165.000000   165.000000   165.000000   \n",
              "\n",
              "                 F            G            H            I            J  ...  \\\n",
              "count  8824.000000  8824.000000  8822.000000  8823.000000  8823.000000  ...   \n",
              "mean     88.007480    87.977561    88.000000    88.022441    88.022441  ...   \n",
              "std      47.519371    47.529755    47.536879    47.535911    47.535911  ...   \n",
              "min      11.000000    11.000000    11.000000    11.000000    11.000000  ...   \n",
              "25%      44.000000    44.000000    44.000000    44.000000    44.000000  ...   \n",
              "50%      88.000000    88.000000    88.000000    88.000000    88.000000  ...   \n",
              "75%     132.000000   132.000000   132.000000   132.000000   132.000000  ...   \n",
              "max     165.000000   165.000000   165.000000   165.000000   165.000000  ...   \n",
              "\n",
              "                 Q            R            S            T            U  \\\n",
              "count  8824.000000  8823.000000  8824.000000  8824.000000  8824.000000   \n",
              "mean     87.972575    87.977559    87.972575    87.987534    88.012466   \n",
              "std      47.535523    47.535911    47.535523    47.521679    47.521679   \n",
              "min      11.000000    11.000000    11.000000    11.000000    11.000000   \n",
              "25%      44.000000    44.000000    44.000000    44.000000    44.000000   \n",
              "50%      88.000000    88.000000    88.000000    88.000000    88.000000   \n",
              "75%     132.000000   132.000000   132.000000   132.000000   132.000000   \n",
              "max     165.000000   165.000000   165.000000   165.000000   165.000000   \n",
              "\n",
              "                 V            W            X            Y            Z  \n",
              "count  8822.000000  8824.000000  8824.000000  8822.000000  8823.000000  \n",
              "mean     87.983791    88.007480    87.977561    88.000000    88.022441  \n",
              "std      47.535001    47.519371    47.529755    47.536879    47.535911  \n",
              "min      11.000000    11.000000    11.000000    11.000000    11.000000  \n",
              "25%      44.000000    44.000000    44.000000    44.000000    44.000000  \n",
              "50%      88.000000    88.000000    88.000000    88.000000    88.000000  \n",
              "75%     132.000000   132.000000   132.000000   132.000000   132.000000  \n",
              "max     165.000000   165.000000   165.000000   165.000000   165.000000  \n",
              "\n",
              "[8 rows x 26 columns]"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-6d79f9e9-c378-47e2-9b72-fb4005b9d2ad\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>A</th>\n",
              "      <th>B</th>\n",
              "      <th>C</th>\n",
              "      <th>D</th>\n",
              "      <th>E</th>\n",
              "      <th>F</th>\n",
              "      <th>G</th>\n",
              "      <th>H</th>\n",
              "      <th>I</th>\n",
              "      <th>J</th>\n",
              "      <th>...</th>\n",
              "      <th>Q</th>\n",
              "      <th>R</th>\n",
              "      <th>S</th>\n",
              "      <th>T</th>\n",
              "      <th>U</th>\n",
              "      <th>V</th>\n",
              "      <th>W</th>\n",
              "      <th>X</th>\n",
              "      <th>Y</th>\n",
              "      <th>Z</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>count</th>\n",
              "      <td>8823.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8822.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8822.000000</td>\n",
              "      <td>8823.000000</td>\n",
              "      <td>8823.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8823.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8822.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8824.000000</td>\n",
              "      <td>8822.000000</td>\n",
              "      <td>8823.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>mean</th>\n",
              "      <td>87.977559</td>\n",
              "      <td>87.972575</td>\n",
              "      <td>87.987534</td>\n",
              "      <td>88.012466</td>\n",
              "      <td>87.983791</td>\n",
              "      <td>88.007480</td>\n",
              "      <td>87.977561</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.022441</td>\n",
              "      <td>88.022441</td>\n",
              "      <td>...</td>\n",
              "      <td>87.972575</td>\n",
              "      <td>87.977559</td>\n",
              "      <td>87.972575</td>\n",
              "      <td>87.987534</td>\n",
              "      <td>88.012466</td>\n",
              "      <td>87.983791</td>\n",
              "      <td>88.007480</td>\n",
              "      <td>87.977561</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.022441</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>std</th>\n",
              "      <td>47.535911</td>\n",
              "      <td>47.535523</td>\n",
              "      <td>47.521679</td>\n",
              "      <td>47.521679</td>\n",
              "      <td>47.535001</td>\n",
              "      <td>47.519371</td>\n",
              "      <td>47.529755</td>\n",
              "      <td>47.536879</td>\n",
              "      <td>47.535911</td>\n",
              "      <td>47.535911</td>\n",
              "      <td>...</td>\n",
              "      <td>47.535523</td>\n",
              "      <td>47.535911</td>\n",
              "      <td>47.535523</td>\n",
              "      <td>47.521679</td>\n",
              "      <td>47.521679</td>\n",
              "      <td>47.535001</td>\n",
              "      <td>47.519371</td>\n",
              "      <td>47.529755</td>\n",
              "      <td>47.536879</td>\n",
              "      <td>47.535911</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>min</th>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "      <td>11.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>25%</th>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "      <td>44.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>50%</th>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "      <td>88.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>75%</th>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "      <td>132.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>max</th>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "      <td>165.000000</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>8 rows × 26 columns</p>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-6d79f9e9-c378-47e2-9b72-fb4005b9d2ad')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-6d79f9e9-c378-47e2-9b72-fb4005b9d2ad button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-6d79f9e9-c378-47e2-9b72-fb4005b9d2ad');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 134
        }
      ],
      "source": [
        "large_df.describe()"
      ]
    },
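    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `describe()` output above has 26 columns although `large_df` has 27: by default only numeric columns are summarized, so `some_text` is left out. A minimal sketch of including non-numeric columns with `include=\"all\"`, using a small hypothetical `toy_df` (an assumption for illustration, not part of this lab's data):\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical toy frame: one numeric column containing a NaN, one text column\n",
        "toy_df = pd.DataFrame({\"A\": [1.0, 2.0, np.nan, 4.0], \"some_text\": [\"a\", \"b\", \"a\", \"c\"]})\n",
        "\n",
        "numeric_summary = toy_df.describe()            # numeric columns only (the default)\n",
        "full_summary = toy_df.describe(include=\"all\")  # also adds unique, top and freq rows\n",
        "\n",
        "print(numeric_summary)\n",
        "print(full_summary)\n",
        "```\n",
        "\n",
        "Statistics that do not apply to a column's type show up as `NaN` in the combined table."
      ]
    },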
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-LghiXSTbdks"
      },
      "source": [
        "#### Saving & loading\n",
        "Pandas can save `DataFrame`s to various backends, including file formats such as CSV, Excel, JSON, HTML and HDF5, or to a SQL database. Let's create a `DataFrame` to demonstrate this:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 135,
      "metadata": {
        "id": "F6dGE_DDbdks",
        "outputId": "9dcd8f4a-ce2c-4bed-90f1-4be820965e74",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 112
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         hobby  weight  birthyear  children\n",
              "alice   Biking    68.5       1985       NaN\n",
              "bob    Dancing    83.1       1984       3.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-7c246183-c0fa-434b-9971-66af874c0421\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>hobby</th>\n",
              "      <th>weight</th>\n",
              "      <th>birthyear</th>\n",
              "      <th>children</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>Biking</td>\n",
              "      <td>68.5</td>\n",
              "      <td>1985</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>Dancing</td>\n",
              "      <td>83.1</td>\n",
              "      <td>1984</td>\n",
              "      <td>3.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-7c246183-c0fa-434b-9971-66af874c0421')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-7c246183-c0fa-434b-9971-66af874c0421 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-7c246183-c0fa-434b-9971-66af874c0421');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 135
        }
      ],
      "source": [
        "my_df = pd.DataFrame(\n",
        "    [[\"Biking\", 68.5, 1985, np.nan], [\"Dancing\", 83.1, 1984, 3]], \n",
        "    columns=[\"hobby\",\"weight\",\"birthyear\",\"children\"],\n",
        "    index=[\"alice\", \"bob\"]\n",
        ")\n",
        "my_df"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cjas67GYbdks"
      },
      "source": [
        "#### Saving\n",
        "Let's save it to CSV, HTML and JSON:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 136,
      "metadata": {
        "id": "JUW20lWIbdku"
      },
      "outputs": [],
      "source": [
        "my_df.to_csv(\"my_df.csv\")\n",
        "my_df.to_html(\"my_df.html\")\n",
        "my_df.to_json(\"my_df.json\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wkwqS47Obdku"
      },
      "source": [
        "Done! Let's take a peek at what was saved:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 137,
      "metadata": {
        "id": "XiXTvwh6bdku",
        "outputId": "ee3dd42c-d24d-4659-9abb-9979d4a83650",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "# my_df.csv\n",
            ",hobby,weight,birthyear,children\n",
            "alice,Biking,68.5,1985,\n",
            "bob,Dancing,83.1,1984,3.0\n",
            "\n",
            "\n",
            "# my_df.html\n",
            "<table border=\"1\" class=\"dataframe\">\n",
            "  <thead>\n",
            "    <tr style=\"text-align: right;\">\n",
            "      <th></th>\n",
            "      <th>hobby</th>\n",
            "      <th>weight</th>\n",
            "      <th>birthyear</th>\n",
            "      <th>children</th>\n",
            "    </tr>\n",
            "  </thead>\n",
            "  <tbody>\n",
            "    <tr>\n",
            "      <th>alice</th>\n",
            "      <td>Biking</td>\n",
            "      <td>68.5</td>\n",
            "      <td>1985</td>\n",
            "      <td>NaN</td>\n",
            "    </tr>\n",
            "    <tr>\n",
            "      <th>bob</th>\n",
            "      <td>Dancing</td>\n",
            "      <td>83.1</td>\n",
            "      <td>1984</td>\n",
            "      <td>3.0</td>\n",
            "    </tr>\n",
            "  </tbody>\n",
            "</table>\n",
            "\n",
            "# my_df.json\n",
            "{\"hobby\":{\"alice\":\"Biking\",\"bob\":\"Dancing\"},\"weight\":{\"alice\":68.5,\"bob\":83.1},\"birthyear\":{\"alice\":1985,\"bob\":1984},\"children\":{\"alice\":null,\"bob\":3.0}}\n",
            "\n"
          ]
        }
      ],
      "source": [
        "for filename in (\"my_df.csv\", \"my_df.html\", \"my_df.json\"):\n",
        "    print(\"#\", filename)\n",
        "    with open(filename, \"rt\") as f:\n",
        "        print(f.read())\n",
        "        print()\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zX9DuLcJbdkv"
      },
      "source": [
        "Note that the index is saved as the first column (with no name) in a CSV file, as `<th>` tags in HTML and as keys in JSON.\n",
        "\n",
        "Saving to other formats works very similarly, but some formats require extra libraries to be installed. For example, saving to Excel requires the openpyxl library:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 138,
      "metadata": {
        "id": "DUfZubs3bdkv"
      },
      "outputs": [],
      "source": [
        "try:\n",
        "    my_df.to_excel(\"my_df.xlsx\", sheet_name='People')\n",
        "except ImportError as e:\n",
        "    print(e)"
      ]
    },
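    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The unnamed first column in the CSV file comes from the index. As a small sketch of how to control it, the snippet below calls `to_csv()` without a path, which returns the CSV text instead of writing a file, and compares the `index` and `index_label` options. The `people` frame simply rebuilds `my_df` so the snippet is self-contained, and the label `name` is an illustrative choice:\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "\n",
        "# Same data as my_df, rebuilt here so the example runs on its own\n",
        "people = pd.DataFrame(\n",
        "    [[\"Biking\", 68.5, 1985, np.nan], [\"Dancing\", 83.1, 1984, 3]],\n",
        "    columns=[\"hobby\", \"weight\", \"birthyear\", \"children\"],\n",
        "    index=[\"alice\", \"bob\"],\n",
        ")\n",
        "\n",
        "with_index = people.to_csv()                     # index written as an unnamed first column\n",
        "without_index = people.to_csv(index=False)       # index dropped entirely\n",
        "named_index = people.to_csv(index_label=\"name\")  # index column gets a header\n",
        "\n",
        "# Print only the header line of each variant\n",
        "for csv_text in (with_index, without_index, named_index):\n",
        "    print(csv_text.splitlines()[0])\n",
        "```\n",
        "\n",
        "Dropping the index only makes sense when it carries no information; here it holds the people's names, so naming the column is usually the safer choice."
      ]
    },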
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "A92UjaI8bdkv"
      },
      "source": [
        "#### Loading\n",
        "Now let's load our CSV file back into a `DataFrame`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 139,
      "metadata": {
        "id": "6HKE0X1Ybdkv",
        "outputId": "ae0a5a30-f563-41f9-d4c8-5aa0e379de02",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 112
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "         hobby  weight  birthyear  children\n",
              "alice   Biking    68.5       1985       NaN\n",
              "bob    Dancing    83.1       1984       3.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-592178bf-c9ec-4dbd-8483-24c4a3ff28eb\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>hobby</th>\n",
              "      <th>weight</th>\n",
              "      <th>birthyear</th>\n",
              "      <th>children</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>alice</th>\n",
              "      <td>Biking</td>\n",
              "      <td>68.5</td>\n",
              "      <td>1985</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>bob</th>\n",
              "      <td>Dancing</td>\n",
              "      <td>83.1</td>\n",
              "      <td>1984</td>\n",
              "      <td>3.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-592178bf-c9ec-4dbd-8483-24c4a3ff28eb')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-592178bf-c9ec-4dbd-8483-24c4a3ff28eb button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-592178bf-c9ec-4dbd-8483-24c4a3ff28eb');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 139
        }
      ],
      "source": [
        "my_df_loaded = pd.read_csv(\"my_df.csv\", index_col=0)\n",
        "my_df_loaded"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-cs24XxSbdkv"
      },
      "source": [
        "As you might guess, there are similar `read_json`, `read_html`, `read_excel` functions as well.  We can also read data straight from the Internet. For example, let's load the top 1,000 U.S. cities from github:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 140,
      "metadata": {
        "id": "J6WRgAnObdkv",
        "outputId": "8e7cf6f5-52e7-4c33-c803-f24b0a60a648",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 238
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "                     State  Population        lat         lon\n",
              "City                                                         \n",
              "Marysville      Washington       63269  48.051764 -122.177082\n",
              "Perris          California       72326  33.782519 -117.228648\n",
              "Cleveland             Ohio      390113  41.499320  -81.694361\n",
              "Worcester    Massachusetts      182544  42.262593  -71.802293\n",
              "Columbia    South Carolina      133358  34.000710  -81.034814"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-4b789c6b-a5d8-43cd-891a-77c0edc3bebb\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>State</th>\n",
              "      <th>Population</th>\n",
              "      <th>lat</th>\n",
              "      <th>lon</th>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>City</th>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "      <th></th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>Marysville</th>\n",
              "      <td>Washington</td>\n",
              "      <td>63269</td>\n",
              "      <td>48.051764</td>\n",
              "      <td>-122.177082</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>Perris</th>\n",
              "      <td>California</td>\n",
              "      <td>72326</td>\n",
              "      <td>33.782519</td>\n",
              "      <td>-117.228648</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>Cleveland</th>\n",
              "      <td>Ohio</td>\n",
              "      <td>390113</td>\n",
              "      <td>41.499320</td>\n",
              "      <td>-81.694361</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>Worcester</th>\n",
              "      <td>Massachusetts</td>\n",
              "      <td>182544</td>\n",
              "      <td>42.262593</td>\n",
              "      <td>-71.802293</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>Columbia</th>\n",
              "      <td>South Carolina</td>\n",
              "      <td>133358</td>\n",
              "      <td>34.000710</td>\n",
              "      <td>-81.034814</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-4b789c6b-a5d8-43cd-891a-77c0edc3bebb')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-4b789c6b-a5d8-43cd-891a-77c0edc3bebb button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-4b789c6b-a5d8-43cd-891a-77c0edc3bebb');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 140
        }
      ],
      "source": [
        "us_cities = None\n",
        "try:\n",
        "    csv_url = \"https://raw.githubusercontent.com/plotly/datasets/master/us-cities-top-1k.csv\"\n",
        "    us_cities = pd.read_csv(csv_url, index_col=0)\n",
        "    us_cities = us_cities.head()\n",
        "except IOError as e:\n",
        "    print(e)\n",
        "us_cities"
      ]
    },
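    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The other readers follow the same pattern as `read_csv`. Here is a minimal sketch, assuming a small frame shaped like `my_df` (the `people` variable and the `people_roundtrip.json` file name are invented for illustration). It writes the frame with `to_json()` and reads it back with `read_json()`, which should restore the row labels from the JSON keys:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "\n",
        "# Hypothetical frame and file name, used only for this round-trip sketch.\n",
        "people = pd.DataFrame(\n",
        "    {\"hobby\": [\"Biking\", \"Dancing\"], \"weight\": [68.5, 83.1]},\n",
        "    index=[\"alice\", \"bob\"],\n",
        ")\n",
        "people.to_json(\"people_roundtrip.json\")  # column-oriented JSON by default\n",
        "\n",
        "# read_json uses the same default orientation, so the index comes back as row labels.\n",
        "people_loaded = pd.read_json(\"people_roundtrip.json\")\n",
        "people_loaded"
      ]
    },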
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XpWNQD_Jbdkv"
      },
      "source": [
        "There are more options available, in particular regarding datetime format. Check out the [documentation](http://pandas.pydata.org/pandas-docs/stable/io.html) for more details."
      ]
    },
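    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "One of those options is date parsing. As a minimal sketch, assuming a small made-up CSV held in memory (the `name` and `joined` columns are invented for illustration), the `parse_dates` argument of `read_csv` converts a column to datetime values while loading:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import io\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical in-memory CSV, used only to illustrate the parse_dates option.\n",
        "csv_text = \"name,joined\\nalice,2021-03-15\\nbob,2022-11-02\\n\"\n",
        "\n",
        "# Without parse_dates the joined column would be loaded as plain strings.\n",
        "members = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=[\"joined\"])\n",
        "members.dtypes"
      ]
    },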
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yHIObss4bdkv"
      },
      "source": [
        "#### Combining `DataFrame`s\n",
        "\n",
        "One powerful feature of pandas is it's ability to perform SQL-like joins on `DataFrame`s. Various types of joins are supported: inner joins, left/right outer joins and full joins. To illustrate this, let's start by creating a couple simple `DataFrame`s:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 141,
      "metadata": {
        "id": "TgNPwsexbdkw",
        "outputId": "a262863a-4c42-4be4-e9f7-b4bb1fcdd8b5",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "  state            city        lat         lng\n",
              "0    CA   San Francisco  37.781334 -122.416728\n",
              "1    NY        New York  40.705649  -74.008344\n",
              "2    FL           Miami  25.791100  -80.320733\n",
              "3    OH       Cleveland  41.473508  -81.739791\n",
              "4    UT  Salt Lake City  40.755851 -111.896657"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-e7a83929-ec06-4dd8-ac8f-bce683a27d07\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state</th>\n",
              "      <th>city</th>\n",
              "      <th>lat</th>\n",
              "      <th>lng</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>CA</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>37.781334</td>\n",
              "      <td>-122.416728</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>NY</td>\n",
              "      <td>New York</td>\n",
              "      <td>40.705649</td>\n",
              "      <td>-74.008344</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>FL</td>\n",
              "      <td>Miami</td>\n",
              "      <td>25.791100</td>\n",
              "      <td>-80.320733</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>OH</td>\n",
              "      <td>Cleveland</td>\n",
              "      <td>41.473508</td>\n",
              "      <td>-81.739791</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>UT</td>\n",
              "      <td>Salt Lake City</td>\n",
              "      <td>40.755851</td>\n",
              "      <td>-111.896657</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e7a83929-ec06-4dd8-ac8f-bce683a27d07')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-e7a83929-ec06-4dd8-ac8f-bce683a27d07 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-e7a83929-ec06-4dd8-ac8f-bce683a27d07');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 141
        }
      ],
      "source": [
        "city_loc = pd.DataFrame(\n",
        "    [\n",
        "        [\"CA\", \"San Francisco\", 37.781334, -122.416728],\n",
        "        [\"NY\", \"New York\", 40.705649, -74.008344],\n",
        "        [\"FL\", \"Miami\", 25.791100, -80.320733],\n",
        "        [\"OH\", \"Cleveland\", 41.473508, -81.739791],\n",
        "        [\"UT\", \"Salt Lake City\", 40.755851, -111.896657]\n",
        "    ], columns=[\"state\", \"city\", \"lat\", \"lng\"])\n",
        "city_loc"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 142,
      "metadata": {
        "id": "-F2yDn3cbdkw",
        "outputId": "0b39866c-fd49-4124-8bf9-26995aab820e",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   population           city       state\n",
              "3      808976  San Francisco  California\n",
              "4     8363710       New York    New-York\n",
              "5      413201          Miami     Florida\n",
              "6     2242193        Houston       Texas"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-c0783f2c-ae37-4c60-8f20-659000b7b317\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>population</th>\n",
              "      <th>city</th>\n",
              "      <th>state</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>808976</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>California</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>8363710</td>\n",
              "      <td>New York</td>\n",
              "      <td>New-York</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>413201</td>\n",
              "      <td>Miami</td>\n",
              "      <td>Florida</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>2242193</td>\n",
              "      <td>Houston</td>\n",
              "      <td>Texas</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c0783f2c-ae37-4c60-8f20-659000b7b317')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-c0783f2c-ae37-4c60-8f20-659000b7b317 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-c0783f2c-ae37-4c60-8f20-659000b7b317');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 142
        }
      ],
      "source": [
        "city_pop = pd.DataFrame(\n",
        "    [\n",
        "        [808976, \"San Francisco\", \"California\"],\n",
        "        [8363710, \"New York\", \"New-York\"],\n",
        "        [413201, \"Miami\", \"Florida\"],\n",
        "        [2242193, \"Houston\", \"Texas\"]\n",
        "    ], index=[3,4,5,6], columns=[\"population\", \"city\", \"state\"])\n",
        "city_pop"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "e767etKZbdkw"
      },
      "source": [
        "Now let's join these `DataFrame`s using the `merge()` function:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 143,
      "metadata": {
        "id": "Mdztg6KPbdkw",
        "outputId": "fa5d35a5-120b-47d2-c513-027739974d54",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "  state_x           city        lat         lng  population     state_y\n",
              "0      CA  San Francisco  37.781334 -122.416728      808976  California\n",
              "1      NY       New York  40.705649  -74.008344     8363710    New-York\n",
              "2      FL          Miami  25.791100  -80.320733      413201     Florida"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-ad6a87de-f1cf-4feb-9023-69bddcabaec2\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state_x</th>\n",
              "      <th>city</th>\n",
              "      <th>lat</th>\n",
              "      <th>lng</th>\n",
              "      <th>population</th>\n",
              "      <th>state_y</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>CA</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>37.781334</td>\n",
              "      <td>-122.416728</td>\n",
              "      <td>808976</td>\n",
              "      <td>California</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>NY</td>\n",
              "      <td>New York</td>\n",
              "      <td>40.705649</td>\n",
              "      <td>-74.008344</td>\n",
              "      <td>8363710</td>\n",
              "      <td>New-York</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>FL</td>\n",
              "      <td>Miami</td>\n",
              "      <td>25.791100</td>\n",
              "      <td>-80.320733</td>\n",
              "      <td>413201</td>\n",
              "      <td>Florida</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-ad6a87de-f1cf-4feb-9023-69bddcabaec2')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-ad6a87de-f1cf-4feb-9023-69bddcabaec2 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-ad6a87de-f1cf-4feb-9023-69bddcabaec2');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 143
        }
      ],
      "source": [
        "pd.merge(left=city_loc, right=city_pop, on=\"city\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0aIeJRFNbdkw"
      },
      "source": [
        "Note that both `DataFrame`s have a column named `state`, so in the result they got renamed to `state_x` and `state_y`.\n",
        "\n",
        "Also, note that Cleveland, Salt Lake City and Houston were dropped because they don't exist in *both* `DataFrame`s. This is the equivalent of a SQL `INNER JOIN`. If you want a `FULL OUTER JOIN`, where no city gets dropped and `NaN` values are added, you must specify `how=\"outer\"`:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 144,
      "metadata": {
        "id": "5p98Bdybbdkw",
        "outputId": "ccdfddce-9760-464b-d623-08d63cecafd8",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 238
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "  state_x            city        lat         lng  population     state_y\n",
              "0      CA   San Francisco  37.781334 -122.416728    808976.0  California\n",
              "1      NY        New York  40.705649  -74.008344   8363710.0    New-York\n",
              "2      FL           Miami  25.791100  -80.320733    413201.0     Florida\n",
              "3      OH       Cleveland  41.473508  -81.739791         NaN         NaN\n",
              "4      UT  Salt Lake City  40.755851 -111.896657         NaN         NaN\n",
              "5     NaN         Houston        NaN         NaN   2242193.0       Texas"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-7207041b-315b-43b6-97bf-41b3c5484867\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state_x</th>\n",
              "      <th>city</th>\n",
              "      <th>lat</th>\n",
              "      <th>lng</th>\n",
              "      <th>population</th>\n",
              "      <th>state_y</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>CA</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>37.781334</td>\n",
              "      <td>-122.416728</td>\n",
              "      <td>808976.0</td>\n",
              "      <td>California</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>NY</td>\n",
              "      <td>New York</td>\n",
              "      <td>40.705649</td>\n",
              "      <td>-74.008344</td>\n",
              "      <td>8363710.0</td>\n",
              "      <td>New-York</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>FL</td>\n",
              "      <td>Miami</td>\n",
              "      <td>25.791100</td>\n",
              "      <td>-80.320733</td>\n",
              "      <td>413201.0</td>\n",
              "      <td>Florida</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>OH</td>\n",
              "      <td>Cleveland</td>\n",
              "      <td>41.473508</td>\n",
              "      <td>-81.739791</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>UT</td>\n",
              "      <td>Salt Lake City</td>\n",
              "      <td>40.755851</td>\n",
              "      <td>-111.896657</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>NaN</td>\n",
              "      <td>Houston</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>2242193.0</td>\n",
              "      <td>Texas</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-7207041b-315b-43b6-97bf-41b3c5484867')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-7207041b-315b-43b6-97bf-41b3c5484867 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-7207041b-315b-43b6-97bf-41b3c5484867');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 144
        }
      ],
      "source": [
        "all_cities = pd.merge(left=city_loc, right=city_pop, on=\"city\", how=\"outer\")\n",
        "all_cities"
      ]
    },
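    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a small aside on the outer join above, `merge()` also accepts `indicator=True`, which adds a `_merge` column saying whether each row came from the left table, the right table, or both, and `suffixes=...` to replace the default `_x`/`_y`. The sketch below uses two tiny stand-in tables (`left_df` and `right_df`, hypothetical names with illustrative values) rather than `city_loc` and `city_pop`:\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical stand-in tables; names and values are for illustration only\n",
        "left_df = pd.DataFrame({'city': ['Miami', 'Cleveland'], 'state': ['FL', 'OH']})\n",
        "right_df = pd.DataFrame({'city': ['Miami', 'Houston'], 'state': ['Florida', 'Texas']})\n",
        "\n",
        "# Outer join: keep every city, label the clashing `state` columns explicitly,\n",
        "# and record where each row came from in the `_merge` column\n",
        "merged = pd.merge(left_df, right_df, on='city', how='outer',\n",
        "                  suffixes=('_abbr', '_name'), indicator=True)\n",
        "print(merged)\n",
        "```\n",
        "\n",
        "Row order in an outer join may differ between pandas versions, so it is probably safer to filter on `_merge` (for example `merged[merged['_merge'] == 'left_only']`) than to rely on row positions."
      ]
    },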
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9InNQCuSbdkw"
      },
      "source": [
        "Of course `LEFT OUTER JOIN` is also available by setting `how=\"left\"`: only the cities present in the left `DataFrame` end up in the result. Similarly, with `how=\"right\"` only cities in the right `DataFrame` appear in the result. For example:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 145,
      "metadata": {
        "id": "eJC2h_mAbdkw",
        "outputId": "3aed3310-57eb-4f63-dce9-072910b94a0a",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "  state_x           city        lat         lng  population     state_y\n",
              "0      CA  San Francisco  37.781334 -122.416728      808976  California\n",
              "1      NY       New York  40.705649  -74.008344     8363710    New-York\n",
              "2      FL          Miami  25.791100  -80.320733      413201     Florida\n",
              "3     NaN        Houston        NaN         NaN     2242193       Texas"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-c08a9d2b-92c7-4932-8407-120e376ca444\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state_x</th>\n",
              "      <th>city</th>\n",
              "      <th>lat</th>\n",
              "      <th>lng</th>\n",
              "      <th>population</th>\n",
              "      <th>state_y</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>CA</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>37.781334</td>\n",
              "      <td>-122.416728</td>\n",
              "      <td>808976</td>\n",
              "      <td>California</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>NY</td>\n",
              "      <td>New York</td>\n",
              "      <td>40.705649</td>\n",
              "      <td>-74.008344</td>\n",
              "      <td>8363710</td>\n",
              "      <td>New-York</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>FL</td>\n",
              "      <td>Miami</td>\n",
              "      <td>25.791100</td>\n",
              "      <td>-80.320733</td>\n",
              "      <td>413201</td>\n",
              "      <td>Florida</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>NaN</td>\n",
              "      <td>Houston</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>2242193</td>\n",
              "      <td>Texas</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c08a9d2b-92c7-4932-8407-120e376ca444')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-c08a9d2b-92c7-4932-8407-120e376ca444 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-c08a9d2b-92c7-4932-8407-120e376ca444');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 145
        }
      ],
      "source": [
        "pd.merge(left=city_loc, right=city_pop, on=\"city\", how=\"right\")"
      ]
    },
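    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "For comparison with the right join above, here is a minimal sketch of `how='left'`. The two small tables (`loc_df` and `pop_df`) are hypothetical stand-ins with rounded, illustrative values, not the notebook's `city_loc` and `city_pop`:\n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical stand-in tables; names and values are for illustration only\n",
        "loc_df = pd.DataFrame({'city': ['Miami', 'Cleveland'], 'lat': [25.79, 41.47]})\n",
        "pop_df = pd.DataFrame({'city': ['Miami', 'Houston'], 'population': [413201, 2242193]})\n",
        "\n",
        "# Left join: every row of loc_df is kept, matched with pop_df where possible\n",
        "left_joined = pd.merge(loc_df, pop_df, on='city', how='left')\n",
        "print(left_joined)\n",
        "```\n",
        "\n",
        "Cleveland is kept with a missing `population`, Houston is dropped because it only appears in the right table, and the `population` column is upcast to float so that it can hold `NaN`."
      ]
    },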
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Wwp527vqbdkx"
      },
      "source": [
        "If the key to join on is actually in one (or both) `DataFrame`'s index, you must use `left_index=True` and/or `right_index=True`. If the key column names differ, you must use `left_on` and `right_on`. For example:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 146,
      "metadata": {
        "id": "t7TB757Ibdkx",
        "outputId": "100b43cb-0198-4028-a911-f71ec3d20697",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 143
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "  state_x           city        lat         lng  population           name  \\\n",
              "0      CA  San Francisco  37.781334 -122.416728      808976  San Francisco   \n",
              "1      NY       New York  40.705649  -74.008344     8363710       New York   \n",
              "2      FL          Miami  25.791100  -80.320733      413201          Miami   \n",
              "\n",
              "      state_y  \n",
              "0  California  \n",
              "1    New-York  \n",
              "2     Florida  "
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-b091e1c2-b653-43a9-8208-70b7342e5f3d\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state_x</th>\n",
              "      <th>city</th>\n",
              "      <th>lat</th>\n",
              "      <th>lng</th>\n",
              "      <th>population</th>\n",
              "      <th>name</th>\n",
              "      <th>state_y</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>CA</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>37.781334</td>\n",
              "      <td>-122.416728</td>\n",
              "      <td>808976</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>California</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>NY</td>\n",
              "      <td>New York</td>\n",
              "      <td>40.705649</td>\n",
              "      <td>-74.008344</td>\n",
              "      <td>8363710</td>\n",
              "      <td>New York</td>\n",
              "      <td>New-York</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>FL</td>\n",
              "      <td>Miami</td>\n",
              "      <td>25.791100</td>\n",
              "      <td>-80.320733</td>\n",
              "      <td>413201</td>\n",
              "      <td>Miami</td>\n",
              "      <td>Florida</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-b091e1c2-b653-43a9-8208-70b7342e5f3d')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-b091e1c2-b653-43a9-8208-70b7342e5f3d button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-b091e1c2-b653-43a9-8208-70b7342e5f3d');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 146
        }
      ],
      "source": [
        "city_pop2 = city_pop.copy()\n",
        "city_pop2.columns = [\"population\", \"name\", \"state\"]\n",
        "pd.merge(left=city_loc, right=city_pop2, left_on=\"city\", right_on=\"name\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GdkK0fRRbdkx"
      },
      "source": [
        "#### Concatenation\n",
        "Rather than joining `DataFrame`s, we may just want to concatenate them. That's what `concat()` is for:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 147,
      "metadata": {
        "id": "OB8vX0C7bdkx",
        "outputId": "a074f2b3-52f3-4491-cf15-c540c5ab310e",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 332
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "        state            city        lat         lng  population\n",
              "0          CA   San Francisco  37.781334 -122.416728         NaN\n",
              "1          NY        New York  40.705649  -74.008344         NaN\n",
              "2          FL           Miami  25.791100  -80.320733         NaN\n",
              "3          OH       Cleveland  41.473508  -81.739791         NaN\n",
              "4          UT  Salt Lake City  40.755851 -111.896657         NaN\n",
              "3  California   San Francisco        NaN         NaN    808976.0\n",
              "4    New-York        New York        NaN         NaN   8363710.0\n",
              "5     Florida           Miami        NaN         NaN    413201.0\n",
              "6       Texas         Houston        NaN         NaN   2242193.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-a8e0d903-430a-4aab-baab-066ec351e75b\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state</th>\n",
              "      <th>city</th>\n",
              "      <th>lat</th>\n",
              "      <th>lng</th>\n",
              "      <th>population</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>CA</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>37.781334</td>\n",
              "      <td>-122.416728</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>NY</td>\n",
              "      <td>New York</td>\n",
              "      <td>40.705649</td>\n",
              "      <td>-74.008344</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>FL</td>\n",
              "      <td>Miami</td>\n",
              "      <td>25.791100</td>\n",
              "      <td>-80.320733</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>OH</td>\n",
              "      <td>Cleveland</td>\n",
              "      <td>41.473508</td>\n",
              "      <td>-81.739791</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>UT</td>\n",
              "      <td>Salt Lake City</td>\n",
              "      <td>40.755851</td>\n",
              "      <td>-111.896657</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>California</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>808976.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>New-York</td>\n",
              "      <td>New York</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>8363710.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>Florida</td>\n",
              "      <td>Miami</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>413201.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>Texas</td>\n",
              "      <td>Houston</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>2242193.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-a8e0d903-430a-4aab-baab-066ec351e75b')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-a8e0d903-430a-4aab-baab-066ec351e75b button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-a8e0d903-430a-4aab-baab-066ec351e75b');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 147
        }
      ],
      "source": [
        "result_concat = pd.concat([city_loc, city_pop])\n",
        "result_concat"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mNaralK8bdkx"
      },
      "source": [
        "Note that this operation aligned the data horizontally (by columns) but not vertically (by rows). In this example, we end up with multiple rows having the same index (eg. 3). Pandas handles this rather gracefully:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 148,
      "metadata": {
        "id": "VDFrmNMGbdkx",
        "outputId": "a8e2c6b4-9ccd-49eb-bca3-add0cdc858c8",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 112
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "        state           city        lat        lng  population\n",
              "3          OH      Cleveland  41.473508 -81.739791         NaN\n",
              "3  California  San Francisco        NaN        NaN    808976.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-f6e56405-2fad-4a83-ae29-2c9db1b0918a\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state</th>\n",
              "      <th>city</th>\n",
              "      <th>lat</th>\n",
              "      <th>lng</th>\n",
              "      <th>population</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>OH</td>\n",
              "      <td>Cleveland</td>\n",
              "      <td>41.473508</td>\n",
              "      <td>-81.739791</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>California</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>808976.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-f6e56405-2fad-4a83-ae29-2c9db1b0918a')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-f6e56405-2fad-4a83-ae29-2c9db1b0918a button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-f6e56405-2fad-4a83-ae29-2c9db1b0918a');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 148
        }
      ],
      "source": [
        "result_concat.loc[3]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oSp4G88tbdkx"
      },
      "source": [
        "Or you can tell pandas to just ignore the index:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 149,
      "metadata": {
        "id": "BrVK8LtDbdkx",
        "outputId": "e601d609-30e6-4bda-8e70-fbc317e74477",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 332
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "        state            city        lat         lng  population\n",
              "0          CA   San Francisco  37.781334 -122.416728         NaN\n",
              "1          NY        New York  40.705649  -74.008344         NaN\n",
              "2          FL           Miami  25.791100  -80.320733         NaN\n",
              "3          OH       Cleveland  41.473508  -81.739791         NaN\n",
              "4          UT  Salt Lake City  40.755851 -111.896657         NaN\n",
              "5  California   San Francisco        NaN         NaN    808976.0\n",
              "6    New-York        New York        NaN         NaN   8363710.0\n",
              "7     Florida           Miami        NaN         NaN    413201.0\n",
              "8       Texas         Houston        NaN         NaN   2242193.0"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-b49e4beb-eb63-42a8-8570-851fc492187d\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state</th>\n",
              "      <th>city</th>\n",
              "      <th>lat</th>\n",
              "      <th>lng</th>\n",
              "      <th>population</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>CA</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>37.781334</td>\n",
              "      <td>-122.416728</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>NY</td>\n",
              "      <td>New York</td>\n",
              "      <td>40.705649</td>\n",
              "      <td>-74.008344</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>FL</td>\n",
              "      <td>Miami</td>\n",
              "      <td>25.791100</td>\n",
              "      <td>-80.320733</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>OH</td>\n",
              "      <td>Cleveland</td>\n",
              "      <td>41.473508</td>\n",
              "      <td>-81.739791</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>UT</td>\n",
              "      <td>Salt Lake City</td>\n",
              "      <td>40.755851</td>\n",
              "      <td>-111.896657</td>\n",
              "      <td>NaN</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>California</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>808976.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>New-York</td>\n",
              "      <td>New York</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>8363710.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>Florida</td>\n",
              "      <td>Miami</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>413201.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>Texas</td>\n",
              "      <td>Houston</td>\n",
              "      <td>NaN</td>\n",
              "      <td>NaN</td>\n",
              "      <td>2242193.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-b49e4beb-eb63-42a8-8570-851fc492187d')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-b49e4beb-eb63-42a8-8570-851fc492187d button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-b49e4beb-eb63-42a8-8570-851fc492187d');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 149
        }
      ],
      "source": [
        "pd.concat([city_loc, city_pop], ignore_index=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rHrwIwkobdkx"
      },
      "source": [
        "Notice that when a column does not exist in a `DataFrame`, it acts as if it was filled with `NaN` values. If we set `join=\"inner\"`, then only columns that exist in *both* `DataFrame`s are returned:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 150,
      "metadata": {
        "id": "FlkwidKubdky",
        "outputId": "1e33194a-f709-4fe7-fde7-ba3282f6ef94",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 332
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "        state            city\n",
              "0          CA   San Francisco\n",
              "1          NY        New York\n",
              "2          FL           Miami\n",
              "3          OH       Cleveland\n",
              "4          UT  Salt Lake City\n",
              "3  California   San Francisco\n",
              "4    New-York        New York\n",
              "5     Florida           Miami\n",
              "6       Texas         Houston"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-c0fbfce4-7011-4c9c-b208-64aacaf4976a\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>state</th>\n",
              "      <th>city</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>CA</td>\n",
              "      <td>San Francisco</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>NY</td>\n",
              "      <td>New York</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>FL</td>\n",
              "      <td>Miami</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>OH</td>\n",
              "      <td>Cleveland</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>UT</td>\n",
              "      <td>Salt Lake City</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>California</td>\n",
              "      <td>San Francisco</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>New-York</td>\n",
              "      <td>New York</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>Florida</td>\n",
              "      <td>Miami</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>Texas</td>\n",
              "      <td>Houston</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-c0fbfce4-7011-4c9c-b208-64aacaf4976a')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-c0fbfce4-7011-4c9c-b208-64aacaf4976a button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-c0fbfce4-7011-4c9c-b208-64aacaf4976a');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 150
        }
      ],
      "source": [
        "pd.concat([city_loc, city_pop], join=\"inner\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fMXy67Thbdkz"
      },
      "source": [
        "#### Categories\n",
        "It is quite frequent to have values that represent categories, for example `1` for female and `2` for male, or `\"A\"` for Good, `\"B\"` for Average, `\"C\"` for Bad. These categorical values can be hard to read and cumbersome to handle, but fortunately pandas makes it easy. To illustrate this, let's take the `city_pop` `DataFrame` we created earlier, and add a column that represents a category:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 151,
      "metadata": {
        "id": "ca9MbHkJbdkz",
        "outputId": "0f9909f0-c306-4e30-a606-a02af7f72f5f",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   population           city       state  eco_code\n",
              "3      808976  San Francisco  California        17\n",
              "4     8363710       New York    New-York        17\n",
              "5      413201          Miami     Florida        34\n",
              "6     2242193        Houston       Texas        20"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-619f0856-b825-4daf-a555-94f42c040961\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>population</th>\n",
              "      <th>city</th>\n",
              "      <th>state</th>\n",
              "      <th>eco_code</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>808976</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>California</td>\n",
              "      <td>17</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>8363710</td>\n",
              "      <td>New York</td>\n",
              "      <td>New-York</td>\n",
              "      <td>17</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>413201</td>\n",
              "      <td>Miami</td>\n",
              "      <td>Florida</td>\n",
              "      <td>34</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>2242193</td>\n",
              "      <td>Houston</td>\n",
              "      <td>Texas</td>\n",
              "      <td>20</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-619f0856-b825-4daf-a555-94f42c040961')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-619f0856-b825-4daf-a555-94f42c040961 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-619f0856-b825-4daf-a555-94f42c040961');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 151
        }
      ],
      "source": [
        "city_eco = city_pop.copy()\n",
        "city_eco[\"eco_code\"] = [17, 17, 34, 20]\n",
        "city_eco"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Z8kMBnYCbdkz"
      },
      "source": [
        "Right now the `eco_code` column is full of apparently meaningless codes. Let's fix that. First, we will create a new categorical column based on the `eco_code`s:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 152,
      "metadata": {
        "id": "3Vvut-u9bdkz",
        "outputId": "f731bdc0-5498-48bf-dc40-78345578e38c",
        "colab": {
          "base_uri": "https://localhost:8080/"
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "Int64Index([17, 20, 34], dtype='int64')"
            ]
          },
          "metadata": {},
          "execution_count": 152
        }
      ],
      "source": [
        "city_eco[\"economy\"] = city_eco[\"eco_code\"].astype('category')\n",
        "city_eco[\"economy\"].cat.categories"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8NwGaKhebdkz"
      },
      "source": [
        "Now we can give each category a meaningful name:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 153,
      "metadata": {
        "id": "NEJS59vnbdkz",
        "outputId": "ef85fa21-7e8b-49ed-ceb8-56a8982e8e75",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   population           city       state  eco_code  economy\n",
              "3      808976  San Francisco  California        17  Finance\n",
              "4     8363710       New York    New-York        17  Finance\n",
              "5      413201          Miami     Florida        34  Tourism\n",
              "6     2242193        Houston       Texas        20   Energy"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-e38d8915-d626-4ea7-a947-bb62cdfd88dc\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>population</th>\n",
              "      <th>city</th>\n",
              "      <th>state</th>\n",
              "      <th>eco_code</th>\n",
              "      <th>economy</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>808976</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>California</td>\n",
              "      <td>17</td>\n",
              "      <td>Finance</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>8363710</td>\n",
              "      <td>New York</td>\n",
              "      <td>New-York</td>\n",
              "      <td>17</td>\n",
              "      <td>Finance</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>413201</td>\n",
              "      <td>Miami</td>\n",
              "      <td>Florida</td>\n",
              "      <td>34</td>\n",
              "      <td>Tourism</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>2242193</td>\n",
              "      <td>Houston</td>\n",
              "      <td>Texas</td>\n",
              "      <td>20</td>\n",
              "      <td>Energy</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e38d8915-d626-4ea7-a947-bb62cdfd88dc')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-e38d8915-d626-4ea7-a947-bb62cdfd88dc button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-e38d8915-d626-4ea7-a947-bb62cdfd88dc');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 153
        }
      ],
      "source": [
        "city_eco[\"economy\"].cat.categories = [\"Finance\", \"Energy\", \"Tourism\"]\n",
        "city_eco"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "b5l_n08-bdkz"
      },
      "source": [
        "Note that categorical values are sorted according to their categorical order, *not* their alphabetical order:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 154,
      "metadata": {
        "id": "r5e0jca6bdkz",
        "outputId": "7470c030-bcc8-475b-e59b-0d2348f38741",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 175
        }
      },
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "   population           city       state  eco_code  economy\n",
              "5      413201          Miami     Florida        34  Tourism\n",
              "6     2242193        Houston       Texas        20   Energy\n",
              "3      808976  San Francisco  California        17  Finance\n",
              "4     8363710       New York    New-York        17  Finance"
            ],
            "text/html": [
              "\n",
              "  <div id=\"df-440b5a5a-7999-4f2e-9cf6-46ae4a7f429c\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>population</th>\n",
              "      <th>city</th>\n",
              "      <th>state</th>\n",
              "      <th>eco_code</th>\n",
              "      <th>economy</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>413201</td>\n",
              "      <td>Miami</td>\n",
              "      <td>Florida</td>\n",
              "      <td>34</td>\n",
              "      <td>Tourism</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>2242193</td>\n",
              "      <td>Houston</td>\n",
              "      <td>Texas</td>\n",
              "      <td>20</td>\n",
              "      <td>Energy</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>808976</td>\n",
              "      <td>San Francisco</td>\n",
              "      <td>California</td>\n",
              "      <td>17</td>\n",
              "      <td>Finance</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>8363710</td>\n",
              "      <td>New York</td>\n",
              "      <td>New-York</td>\n",
              "      <td>17</td>\n",
              "      <td>Finance</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-440b5a5a-7999-4f2e-9cf6-46ae4a7f429c')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "        \n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "      \n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-440b5a5a-7999-4f2e-9cf6-46ae4a7f429c button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-440b5a5a-7999-4f2e-9cf6-46ae4a7f429c');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n",
              "  "
            ]
          },
          "metadata": {},
          "execution_count": 154
        }
      ],
      "source": [
        "city_eco.sort_values(by=\"economy\", ascending=False)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-c0f9fnEbdk0"
      },
      "source": [
        "## What next?\n",
        "As you probably noticed by now, pandas is quite a large library with *many* features. Although we went through the most important features, there is still a lot to discover. Probably the best way to learn more is to get your hands dirty with some real-life data. It is also a good idea to go through pandas' excellent [documentation](http://pandas.pydata.org/pandas-docs/stable/index.html), in particular the [Cookbook](http://pandas.pydata.org/pandas-docs/stable/cookbook.html).\n",
        "\n",
        "You can also work with Bigquery in Panda. Check out https://googleapis.dev/python/bigquery/latest/usage/pandas.html and https://pandas-gbq.readthedocs.io/en/latest/ for more details."
      ]
    }
  ]
}